Java Interview Questions About Strings

What are Strings in Java?

Strings represent a sequence of characters. In Java, String is not a primitive data type like int or long. Strings are objects, and their characters are stored internally in an array (a char array up to Java 8, a byte array since Java 9).

The String class is used to represent strings and provides some methods that can be used to manipulate them and perform various operations.

How can you create Strings in Java?

The most common way to create a String is to use a literal like so:

String myString = "Hello World";

You will use this almost all of the time, but there are cases when you need one of the constructors of the String class (for example, when creating a String from a character array).
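
As a minimal sketch of those constructor cases, the following shows three constructors of the String class that do something a literal cannot. The class name StringCreationDemo and its helper methods are only illustrative names, not part of any library.

```java
import java.nio.charset.StandardCharsets;

class StringCreationDemo {

    // Builds a String from a whole character array.
    static String fromChars(char[] chars) {
        return new String(chars);
    }

    // Builds a String from a slice of a character array (start offset, character count).
    static String fromCharRange(char[] chars, int offset, int count) {
        return new String(chars, offset, count);
    }

    // Decodes raw bytes using an explicit charset.
    static String fromUtf8(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        char[] chars = {'H', 'e', 'l', 'l', 'o'};
        System.out.println(fromChars(chars));           // Hello
        System.out.println(fromCharRange(chars, 1, 3)); // ell
        System.out.println(fromUtf8("World".getBytes(StandardCharsets.UTF_8))); // World
    }
}
```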

What is important to remember here is not to use this constructor:

new String("Hello World")

This first creates a String from the literal you specified (“Hello World”), then passes it to the constructor, which creates another String object with the same value. You end up with two objects instead of one, which wastes memory and time for no benefit.

How to check if two Strings are equal?

There are two ways to check for String equality.

The first way is to use the double equals operator (==). In most cases, however, this will not give the expected result, because it compares the references of the two Strings, not their content. Two different String objects can hold the same value while their references differ.

The recommended way is to use the equals() method. It works as expected because it compares the values of the two String objects.
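
Here is a small sketch of the difference. The class and method names are made up for illustration, and new String(...) is used only to force a second, distinct object with the same content.

```java
class StringEqualityDemo {

    // Compares references: true only if both variables point to the same object.
    static boolean sameReference(String a, String b) {
        return a == b;
    }

    // Compares content.
    static boolean sameValue(String a, String b) {
        return a.equals(b);
    }

    public static void main(String[] args) {
        String literal = "Hello";
        // Deliberately creates a separate object holding the same characters.
        String copy = new String("Hello");

        System.out.println(sameReference(literal, copy)); // false
        System.out.println(sameValue(literal, copy));     // true
    }
}
```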

Why are there classes like StringBuilder or StringBuffer?

Performing a lot of operations (like concatenation) on a String until it reaches its final form can create a lot of String objects. Because String is immutable, each of these operations may have to create a new object.

StringBuilder and StringBuffer help in this situation by keeping the characters you are building in an internal, resizable buffer. They only produce a String object when you are done building and ask for the result by calling toString().

Let’s see an example. Without StringBuilder:

String stringByConcatenation = "Hello";

stringByConcatenation += " ";
stringByConcatenation += "World";
stringByConcatenation += "!";

After executing these lines we will have the following String objects:

  • 4 String literals we see in the code.
  • 3 Strings produced by the concatenations (2 intermediate results and the final one).

With StringBuilder:

StringBuilder builder = new StringBuilder("Hello");
builder.append(" ");
builder.append("World");
builder.append("!");

String stringFromBuilder = builder.toString();

After executing these lines we will have the following String objects:

  • 4 String literals we see in the code.
  • 1 extra String when we call toString().

As you can see, in this example we have 5 Strings instead of 7. In more complex examples, the saving would be even greater. For 1000 strings, it would be 1999 (without StringBuilder) vs 1001 (with StringBuilder).
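
The same idea in a loop, as a rough sketch (BuilderLoopDemo and its methods are illustrative names). Both methods return the same text, but the += version produces a new String in every iteration, while the builder version produces one String at the end.

```java
class BuilderLoopDemo {

    // Appends the numbers 0..count-1 to a single StringBuilder.
    static String joinWithBuilder(int count) {
        StringBuilder builder = new StringBuilder();
        for (int i = 0; i < count; i++) {
            builder.append(i);
        }
        return builder.toString();
    }

    // Produces the same text with +=, creating a new String on every pass.
    static String joinWithConcatenation(int count) {
        String result = "";
        for (int i = 0; i < count; i++) {
            result += i;
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(joinWithBuilder(5)); // 01234
        System.out.println(joinWithBuilder(1000).equals(joinWithConcatenation(1000))); // true
    }
}
```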

What is the difference between StringBuilder and StringBuffer?

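StringBuffer is synchronized, so a single instance can be safely used from multiple threads. StringBuilder is not synchronized, which makes it faster and the better choice when only one thread uses the instance.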
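
To illustrate what synchronized buys you, here is a minimal sketch with hypothetical names (SharedBufferDemo, appendConcurrently). Several threads append to one shared StringBuffer and no append is lost. With a shared StringBuilder the same code would not be safe: characters could be lost or an exception thrown.

```java
class SharedBufferDemo {

    // Lets several threads append to one shared StringBuffer and returns its final length.
    static int appendConcurrently(int threads, int appendsPerThread) {
        StringBuffer buffer = new StringBuffer();
        Thread[] workers = new Thread[threads];

        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < appendsPerThread; i++) {
                    buffer.append('x'); // append() is synchronized on the buffer
                }
            });
            workers[t].start();
        }

        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("Interrupted while waiting for workers", e);
            }
        }
        return buffer.length();
    }

    public static void main(String[] args) {
        System.out.println(appendConcurrently(4, 10000)); // 40000
    }
}
```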

What does it mean that a String is immutable?

It means that once a String object is created, its value cannot be changed. Methods that seem to modify a String return a new object instead. More on that here: What is the difference between final and immutable in Java?
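
A quick sketch of what this looks like in practice (ImmutabilityDemo and transform are illustrative names): the "modifying" methods return new Strings and leave the original untouched.

```java
import java.util.Locale;

class ImmutabilityDemo {

    // Returns the original plus two derived Strings; the original is never altered.
    static String[] transform(String original) {
        String upper = original.toUpperCase(Locale.ROOT);
        String replaced = original.replace('l', 'L');
        return new String[] {original, upper, replaced};
    }

    public static void main(String[] args) {
        String[] results = transform("hello");
        System.out.println(results[0]); // hello
        System.out.println(results[1]); // HELLO
        System.out.println(results[2]); // heLLo
    }
}
```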

Are Strings thread safe?

Strings are thread safe because they are immutable: their state cannot change after they are created, so they can be shared between threads without synchronization.

What is String interning?

See our separate article on this topic: What is String interning in Java?

Bonus: Why could String’s substring() method cause a memory leak in JDK6?

In JDK6 the String class contained three fields: value, offset, count.

  • value – The characters of the String as a char array.
  • offset – The index in the array of the first character that belongs to this String.
  • count – The number of characters in the String.

When the substring() method was called, it just created a new String object that shared the same value array, but used a different offset and count. As a result, the new object kept a reference to the original character array and made it impossible for that array to be garbage collected, even if only a few characters of it were still needed. If you had a lot of huge Strings, this wasted memory could add up to large amounts.

Here is the String constructor that causes the mentioned problem in JDK6.

String(int offset, int count, char value[]) {
    this.value = value;
    this.offset = offset;
    this.count = count;
}

This issue was resolved in JDK7 (starting with update 6). The String class no longer has the offset and count fields. When the substring() method is used, the required characters are copied into a new array that belongs to the new String.
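
For code that still had to run on JDK6, a common workaround was to wrap the substring in new String(...), which copied only the needed characters. The sketch below uses illustrative names (SubstringCopyDemo, detachedPrefix). On JDK7 and later the wrapper is redundant, because substring() already copies.

```java
class SubstringCopyDemo {

    // Returns the first `length` characters as a String with its own storage.
    static String detachedPrefix(String source, int length) {
        // On JDK6, substring() shared the source's char array; new String(...)
        // trimmed it down to a copy of just the characters in use.
        return new String(source.substring(0, length));
    }

    public static void main(String[] args) {
        StringBuilder big = new StringBuilder();
        for (int i = 0; i < 100000; i++) {
            big.append("0123456789");
        }
        String huge = big.toString();

        String prefix = detachedPrefix(huge, 5);
        System.out.println(prefix); // 01234
    }
}
```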

What is the Difference Between Final and Immutable in Java?

Final

If you declare a field or variable final, you cannot change the value stored in it after it has been assigned. For a reference, this means it will always point to the same object. While you cannot replace the stored reference with another one, you can still modify the referenced object (for example, update its fields).
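
A minimal sketch of this (FinalReferenceDemo is an illustrative name): the final variable cannot be reassigned, but the list it points to can still be changed.

```java
import java.util.ArrayList;
import java.util.List;

class FinalReferenceDemo {

    static List<String> fillFinalList() {
        final List<String> names = new ArrayList<>();

        // Allowed: this modifies the referenced object, not the reference.
        names.add("John");
        names.add("Jane");

        // Not allowed: reassigning a final variable does not compile.
        // names = new ArrayList<>();

        return names;
    }

    public static void main(String[] args) {
        System.out.println(fillFinalList()); // [John, Jane]
    }
}
```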

For classes, final means that you cannot create a subclass of it.

Making something final is just a matter of adding a keyword. Achieving immutability is a bit more complex.

Immutable

If an object is immutable, its state/value cannot change over time. Good examples are the String, BigInteger and BigDecimal classes.

BigInteger, for example, has a number of “manipulation” methods like add(), but these methods do not modify the original object. They return a new one instead.

public BigInteger add(BigInteger val) {
    // ...
    return new BigInteger(resultMag, cmp == signum ? 1 : -1);
}

Making an object immutable is the responsibility of the programmer. It cannot be achieved just by adding a keyword, as is the case with final.
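
As a rough sketch of what that responsibility involves, here is one possible way to write an immutable class. ImmutablePerson is a hypothetical example class: it has no setters, all fields are private and final, mutable input is copied defensively, and "modification" returns a new object.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class ImmutablePerson {

    private final String name;
    private final List<String> nicknames;

    ImmutablePerson(String name, List<String> nicknames) {
        this.name = name;
        // Defensive copy, wrapped so that nobody can modify it later.
        this.nicknames = Collections.unmodifiableList(new ArrayList<>(nicknames));
    }

    String getName() {
        return name;
    }

    List<String> getNicknames() {
        return nicknames;
    }

    // Returns a new object instead of changing this one.
    ImmutablePerson withName(String newName) {
        return new ImmutablePerson(newName, nicknames);
    }

    public static void main(String[] args) {
        List<String> nicknames = new ArrayList<>();
        nicknames.add("Johnny");

        ImmutablePerson john = new ImmutablePerson("John", nicknames);
        nicknames.add("JD"); // changes the caller's list only

        ImmutablePerson jack = john.withName("Jack");

        System.out.println(john.getName());      // John
        System.out.println(jack.getName());      // Jack
        System.out.println(john.getNicknames()); // [Johnny]
    }
}
```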

Final and immutable

Of course, a class can be final and immutable at the same time. A good example of this is the String class.

What is String interning in Java?

What does interning mean?

In a Java application, you can create a lot of String instances with the same content. Without String interning all of these would occupy separate memory areas as separate objects. If there are lots of them, this could mean a significant amount of memory.

String interning in Java is basically the way of storing only one copy of each distinct String value. As you will see, in a lot of cases it happens automatically, but in some cases, you have to do it manually by calling the String.intern() method.

How does it work?

Because the same String value often occurs many times, Java maintains a so-called String Pool. The String Pool lives in heap memory (since Java 7; earlier versions kept it in the permanent generation) and stores each of the interned Strings in your application. The pool is maintained privately by the String class.

When you call intern() on a String, Java checks whether a String with equal content is already present in the String Pool. If it is, a reference to that pooled object is returned and nothing is added. If it is not, the String is added to the pool and a reference to it is returned.
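
The lookup described above can be observed with a small sketch (InternLookupDemo is an illustrative name). A String built at runtime is a separate object, but calling intern() on it returns the instance that is already in the pool.

```java
class InternLookupDemo {

    // True if intern() hands back the pooled literal rather than the runtime copy.
    static boolean internReturnsPooledInstance() {
        String literal = "John";                                    // interned automatically
        String runtime = new String(new char[] {'J', 'o', 'h', 'n'}); // separate object

        boolean distinctBefore = runtime != literal;
        boolean sameAfterIntern = runtime.intern() == literal;
        return distinctBefore && sameAfterIntern;
    }

    public static void main(String[] args) {
        System.out.println(internReturnsPooledInstance()); // true
    }
}
```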

When should I use interning?

Automatic interning

As I mentioned, there are cases when interning happens automatically. Let me quote from the JavaDoc of the intern() method:

“All literal strings and string-valued constant expressions are interned.”

So the following two Strings will be interned automatically (first is a literal, second is a string-valued constant expression):

String a = "John";
String b = "Jane" + "Doe";
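
Here is a short sketch of what counts as a constant expression (ConstantExpressionDemo and its members are illustrative names). Concatenating literals or final constants is resolved at compile time and therefore interned. Concatenating a regular, non-final variable happens at runtime and is not.

```java
class ConstantExpressionDemo {

    static final String FIRST = "Jane";

    // Literal + literal is a constant expression.
    static boolean literalsAreInterned() {
        String expected = "JaneDoe";
        String constant = "Jane" + "Doe";
        return expected == constant;
    }

    // final variables initialized with constants also form a constant expression.
    static boolean finalsAreInterned() {
        final String last = "Doe";
        String expected = "JaneDoe";
        String fromFinals = FIRST + last;
        return expected == fromFinals;
    }

    // A non-final variable makes the concatenation happen at runtime.
    static boolean runtimeResultIsInterned() {
        String first = "Jane";
        String expected = "JaneDoe";
        String runtime = first + "Doe";
        return expected == runtime;
    }

    public static void main(String[] args) {
        System.out.println(literalsAreInterned());     // true
        System.out.println(finalsAreInterned());       // true
        System.out.println(runtimeResultIsInterned()); // false
    }
}
```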

Manual interning

Other than the above-mentioned cases, interning does not happen automatically. Let’s take an example when you create two String variables and concatenate them into new variables:

String a = "John";
String a2 = "John";
String b = "Doe";
String c1 = a + b;
String c2 = a + b;

In this case, the “John” and “Doe” Strings are interned (stored in the String Pool), so only one object containing “John” and one with “Doe” is created.

However, the “JohnDoe” String is not interned and as a result, not stored in the String Pool. Because of this, the above code will create two String objects that contain the “JohnDoe” String.

To use interning manually, you can call the intern method on the resulting String like this:

String c1 = (a + b).intern();
String c2 = (a + b).intern();

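This ensures that c1 and c2 refer to the same String object with the content “JohnDoe”, stored in the String Pool. The temporary Strings produced by a + b are still created, but they become eligible for garbage collection right away.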
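
You can check this yourself with a reference comparison. The sketch below wraps the two variants in methods with illustrative names (ManualInternDemo and its methods are not part of any library).

```java
class ManualInternDemo {

    // Two runtime concatenations produce two separate objects.
    static boolean sameObjectWithoutIntern(String a, String b) {
        String c1 = a + b;
        String c2 = a + b;
        return c1 == c2;
    }

    // After intern(), both variables refer to the single pooled object.
    static boolean sameObjectWithIntern(String a, String b) {
        String c1 = (a + b).intern();
        String c2 = (a + b).intern();
        return c1 == c2;
    }

    public static void main(String[] args) {
        System.out.println(sameObjectWithoutIntern("John", "Doe")); // false
        System.out.println(sameObjectWithIntern("John", "Doe"));    // true
    }
}
```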

Memory savings by interning

As I described above, interning allows you to reuse the same String objects multiple times. Of course, this results in less memory usage in the Heap space, where objects are stored.

The savings depend on your application. In small apps, you might not notice the difference, but in case of a lot of Strings the memory gain can be significant.

Example

I have put together a small test application to demonstrate that:

package com.jtuts;

import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class Main {

    public static void main(String[] args) throws InterruptedException {

        Thread.sleep(10000);
        List<List<String>> listOfLists = new ArrayList<>();

        IntStream.range(0, 10).forEach(i -> {
            try {

                System.out.println("Starting iteration " + i);

                List<String> list = IntStream.range(0, 100000).mapToObj(j -> {
                    String s1 = "ABCABCABCABCABCABCABCABCABCABCABCABCABC";
                    String s2 = "123123123123123123123123123123123123123";
                    return s1 + s2;
                }).collect(Collectors.toList());

                listOfLists.add(list);

                Thread.sleep(5000);
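            } catch (InterruptedException e) {
                // Restore the interrupt flag instead of silently swallowing it.
                Thread.currentThread().interrupt();
            }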
        });
    }

}

This application will attempt to create 10 x 100 000 quite long String objects. As you can see, I am creating two Strings via literals (these will be interned automatically), but then I concatenate them, creating a brand new String that will not be automatically interned.

I am collecting all these Strings in a List, so the Garbage Collector cannot reclaim any of them until the end of my application. I also added some delays so the memory allocation changes are more visible.

I started the application and also started up JConsole, and this is what I saw:

As you can see, my application is using more than 200 MB of Heap memory. That’s a lot. It is because we have created 1 million pretty big Strings and did not intern them.

Now let’s see what happens if I change the line where I return s1 + s2 to the following:

(s1 + s2).intern()

If I run the application again, now I get the following chart:

As you can see, the Heap memory usage doesn’t even reach 30 MB, which is way less than the first version.

The takeaway from this example is that if you have a large enough application that is for some reason producing a lot of Strings with the same content, then using interning can save you a lot of memory.

The effect of interning on comparison performance

As you probably know, when you want to compare two Strings by content you have to use the equals() method. Unless both references point to the same object or the lengths differ, it goes through the Strings and compares them character by character. With a lot of comparisons, this can take a considerable amount of time.

But wait! There is another way of comparing two objects: the double equals operator (==). The problem is that we cannot use it for String content comparison, because it only compares the references of the objects, which might not be the same even for Strings with the same content.

This example shows it to you very well:

String s1 = "John";
String s2 = "Doe";
String s3 = "JohnDoe";
String s4 = s1 + s2;

System.out.println(s3.equals(s4)); // prints true
System.out.println(s3 == s4);      // prints false

But what if we intern the Strings first and then do the double equals comparison? Exactly! That will work, because as I mentioned earlier, by interning you make sure that two Strings with the same content will refer to the same object in the String Pool.

String s1 = "John";
String s2 = "Doe";
String s3 = "JohnDoe";
String s4 = (s1 + s2).intern();

System.out.println(s3.equals(s4)); // prints true
System.out.println(s3 == s4);      // prints true

As you can see, this works, and comparing two references is much faster than comparing two Strings character by character.

So, should I use interning all the time?

No, definitely not.

First of all, not all applications would benefit considerably from interning. For a noticeable benefit, the application has to handle a lot of large String objects that have the same content. I think that is rarely the case.

Secondly, using interning means a lot of overhead. You have to add the intern() call in a lot of places, and if you plan to use the double equals operator on your Strings for performance reasons, you have to be absolutely sure that intern() is used everywhere it is needed. If you miss some occurrences, your comparisons could give you wrong results.

All in all, I think that interning should be skipped by default. However, if you suspect that your application could benefit considerably from it, it is worth measuring the possible gains.