Memory Leaks Caused by the substring Method in Java

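In Java we do not have to release memory ourselves: the JVM manages memory, and the garbage collector reclaims objects that are no longer needed. In practice, though, careless code can still cause memory problems. The two most common are memory leaks and running out of memory.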

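Out of memory: put simply, there is not enough memory left. For example, code that keeps creating large objects in an infinite loop, and keeps them reachable, will soon fail with an OutOfMemoryError.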

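Memory leak: memory allocated for an object is not released once the object is no longer in use. It stays occupied, the memory actually available shrinks, and it looks as if memory has leaked away.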
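As a rough illustration of the definition above, the sketch below shows one common way a Java program leaks: objects that are no longer needed stay reachable through a long-lived collection. The class and method names (LeakyRegistry, handle, retainedCount) are hypothetical and exist only for this example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical class for illustration only; not part of the original article.
class LeakyRegistry {
    // A static collection lives as long as the class is loaded,
    // so everything added to it stays reachable.
    private static final List<byte[]> HANDLED = new ArrayList<>();

    static void handle(byte[] request) {
        // ... the request would be processed here ...
        HANDLED.add(request); // never removed, so the GC cannot reclaim the buffer
    }

    static int retainedCount() {
        return HANDLED.size();
    }

    public static void main(String[] args) {
        for (int i = 0; i < 3; i++) {
            handle(new byte[1024]);
        }
        // All three buffers are still referenced although nobody needs them.
        System.out.println(retainedCount());
    }
}
```

Removing the entry once the request is finished, or not storing it at all, would let the garbage collector reclaim each buffer.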

The memory leak caused by substring

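substring(int beginIndex, int endIndex) is a method of the String class. Its implementation in JDK 6 is completely different from the one in JDK 7, although both return the same result. Knowing how the two differ helps you use the method well, because careless use of substring on JDK 6 can cause a serious memory leak.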

1. What substring does

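substring(int beginIndex, int endIndex) returns the part of the parent string that starts at beginIndex and ends at endIndex - 1. Indexes in the parent string start at 0. The substring includes the character at beginIndex and excludes the one at endIndex.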

String x = "abcdef";
x = x.substring(1, 3); // take the characters at indexes 1 and 2
System.out.println(x);

The program above prints "bc".
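The index rules can be checked with a few edge cases. The sketch below is only an illustration (the class name SubstringBounds is made up). It shows that the result always has endIndex - beginIndex characters and that a begin index greater than the end index is rejected.

```java
// Hypothetical demo class; the calls themselves are standard java.lang.String methods.
class SubstringBounds {
    public static void main(String[] args) {
        String s = "abcdef";

        // Indexes 1 and 2 are included, index 3 is excluded.
        System.out.println(s.substring(1, 3));           // bc

        // Equal indexes give an empty string (length 2 - 2 = 0).
        System.out.println(s.substring(2, 2).isEmpty()); // true

        // The full range returns the whole content.
        System.out.println(s.substring(0, s.length()));  // abcdef

        try {
            s.substring(4, 2); // beginIndex > endIndex
        } catch (StringIndexOutOfBoundsException e) {
            System.out.println("begin > end is rejected");
        }
    }
}
```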

2. How it works

String is immutable. When x is reassigned in the second statement above, it points to a new String object instead of changing the original one.

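This simple picture, however, does not accurately describe what happens on the heap. What really happens when substring is called is exactly where JDK 6 and JDK 7 differ.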

substring in JDK 6

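A String is stored as a char array. The String class has three fields: char[] value, int offset and int count. They hold the actual character array, the index of the first character in that array, and the number of characters in the string. Together these three fields define a string. When substring is called it creates a new String object, but that object's value still refers to the parent's char array. The only difference between the parent string and the substring is the values of offset and count.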

Here is the JDK 6 source of substring:

public String substring(int beginIndex, int endIndex) {
    if (beginIndex < 0) {
        throw new StringIndexOutOfBoundsException(beginIndex);
    }
    if (endIndex > count) {
        throw new StringIndexOutOfBoundsException(endIndex);
    }
    if (beginIndex > endIndex) {
        throw new StringIndexOutOfBoundsException(endIndex - beginIndex);
    }
    // The new String reuses the parent's char array (value); only offset and count change.
    return ((beginIndex == 0) && (endIndex == count)) ? this :
        new String(offset + beginIndex, endIndex - beginIndex, value);
}

String(int offset, int count, char value[]) {
    this.value = value;
    this.offset = offset;
    this.count = count;
}
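To make the sharing visible, here is a simplified model of the three-field layout described above. It is a sketch, not the real java.lang.String: the class name SharedSlice is hypothetical, bounds checks are left out, and the fields are exposed only so the example can inspect them.

```java
// Simplified, hypothetical model of the JDK 6 layout; not the real java.lang.String.
final class SharedSlice {
    final char[] value; // backing array, possibly shared with a parent
    final int offset;   // index of the first character inside value
    final int count;    // number of characters in this slice

    SharedSlice(String s) {
        this(s.toCharArray(), 0, s.length());
    }

    private SharedSlice(char[] value, int offset, int count) {
        this.value = value;
        this.offset = offset;
        this.count = count;
    }

    SharedSlice substring(int beginIndex, int endIndex) {
        // Bounds checks omitted for brevity. No characters are copied:
        // the new slice points at the same array with a different window.
        return new SharedSlice(value, offset + beginIndex, endIndex - beginIndex);
    }

    @Override
    public String toString() {
        return new String(value, offset, count);
    }

    public static void main(String[] args) {
        SharedSlice parent = new SharedSlice("abcdefghijklmnopqrst");
        SharedSlice sub = parent.substring(1, 3);
        System.out.println(sub);                       // bc
        System.out.println(sub.value == parent.value); // true: same array
        System.out.println(sub.value.length);          // 20: the whole parent array is retained
    }
}
```

The two-character slice keeps all 20 characters alive, which is the retention problem in miniature.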

This leads to the following memory leak:

String str = "abcdefghijklmnopqrst";
String sub = str.substring(1, 3);
str = null;

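This short program has two string variables, str and sub, where sub is cut out of the parent string str. Suppose it runs on JDK 6. Arrays are allocated on the heap, so sub and str share the same internal char array value, namely the array holding the characters a to t. The only difference between str and sub is the offset into that array and the character count.

The third statement sets str to null, intending to free the space str occupies. The GC cannot reclaim the large char array at this point, however, because sub still references it internally, even though sub covers only a small part of it. When str is a very large string the waste is significant and can even cause performance problems. One way to solve this is the following: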

String str = "abcdefghijklmnopqrst";
String sub = str.substring(1, 3) + "";
str = null;

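This relies on string concatenation, which creates a new string backed by a new internal char array that holds only the characters it actually needs. The parent's char array is then no longer referenced by the substring, so after str = null the whole space occupied by str can be reclaimed at the next garbage collection. Code written this way is clearly not pretty, however, so substring was reimplemented in JDK 7.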
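The same effect can be spelled out more explicitly with the String(String) copy constructor, which on JDK 6 copies only the characters in use when the argument's backing array is larger than the string itself. The helper below is a sketch under that assumption. The names SubstringDetach and detachedSubstring are hypothetical, and on JDK 7 and later the extra copy is unnecessary.

```java
// Hypothetical helper for illustration; only standard String APIs are used.
class SubstringDetach {
    static String detachedSubstring(String source, int beginIndex, int endIndex) {
        // Wrapping the result in new String(...) asks for a string with its own
        // backing storage, so the large parent array is not kept alive (JDK 6).
        return new String(source.substring(beginIndex, endIndex));
    }

    public static void main(String[] args) {
        String str = "abcdefghijklmnopqrst";
        String sub = detachedSubstring(str, 1, 3);
        str = null; // the parent can now be reclaimed independently of sub
        System.out.println(sub); // bc
    }
}
```

Compared with appending an empty string, this form states the intent to copy, which makes the workaround easier to recognise and to remove once the code no longer runs on JDK 6.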

substring in JDK 7

JDK 7 improved the implementation of substring (the change arrived in update 6): it now creates a new char array on the heap to hold the characters of the substring.

Here is the source of substring in the JDK 7 String class:

public String substring(int beginIndex, int endIndex) {
    if (beginIndex < 0) {
        throw new StringIndexOutOfBoundsException(beginIndex);
    }
    if (endIndex > value.length) {
        throw new StringIndexOutOfBoundsException(endIndex);
    }
    int subLen = endIndex - beginIndex;
    if (subLen < 0) {
        throw new StringIndexOutOfBoundsException(subLen);
    }
    return ((beginIndex == 0) && (endIndex == value.length)) ? this
            : new String(value, beginIndex, subLen);
}

public String(char value[], int offset, int count) {
    if (offset < 0) {
        throw new StringIndexOutOfBoundsException(offset);
    }
    if (count < 0) {
        throw new StringIndexOutOfBoundsException(count);
    }
    // Note: offset or count might be near -1>>>1.
    if (offset > value.length - count) {
        throw new StringIndexOutOfBoundsException(offset + count);
    }
    this.value = Arrays.copyOfRange(value, offset, offset+count);
}

The copyOfRange method of the Arrays class:

public static char[] copyOfRange(char[] original, int from, int to) {
    int newLength = to - from;
    if (newLength < 0)
        throw new IllegalArgumentException(from + " > " + to);
    char[] copy = new char[newLength];   // a new char array is created here
    System.arraycopy(original, from, copy, 0,
                     Math.min(original.length - from, newLength));
    return copy;
}

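As you can see, a new char array is created to hold the characters of the substring. The substring and the parent string are no longer tied to each other, and once the reference to the parent string is gone, the GC can reclaim the memory it occupied in due course.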
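The independence of the copied array is easy to check directly with Arrays.copyOfRange. The sketch below uses a hypothetical class CopyOfRangeDemo and a hypothetical helper slice. It copies two characters out of a 20-character array and then changes the original to show that the copy is unaffected.

```java
import java.util.Arrays;

// Hypothetical demo class; Arrays.copyOfRange is the standard JDK method quoted above.
class CopyOfRangeDemo {
    static char[] slice(char[] parent, int begin, int end) {
        // Allocates a new array of length end - begin and copies the range into it.
        return Arrays.copyOfRange(parent, begin, end);
    }

    public static void main(String[] args) {
        char[] parent = "abcdefghijklmnopqrst".toCharArray();
        char[] sub = slice(parent, 1, 3);

        System.out.println(new String(sub)); // bc
        System.out.println(sub.length);      // 2: only the needed characters are kept

        parent[1] = 'X';                     // changing the parent...
        System.out.println(new String(sub)); // bc: ...does not affect the copy
    }
}
```

The price of this design is an O(n) copy on every substring call, in exchange for never pinning the parent's array in memory.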
