Samiksha Jaiswal (Editor)

Brace notation

In several programming languages, such as C, Python, and PHP, brace notation is a shorthand for extracting individual characters or bytes from a string variable by index, without calling a substring function.

In pseudocode

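In pseudocode, brace notation places the index in braces directly after the variable name, so an expression such as a_string{82} would extract the 82nd character from the string a_string.

The equivalent using a hypothetical function 'MID' would be a call such as MID(a_string, 82, 1), assuming MID takes the string, a starting position, and a length.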
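The equivalence between the two forms can be sketched in Python. This is only an illustration: the `mid` helper, its 1-based `start` parameter, and the sample string are assumptions, not part of the original article. Because Python counts from 0, the 82nd character sits at index 81.

```python
def mid(text: str, start: int, length: int = 1) -> str:
    """Hypothetical MID-style helper: 1-based start, returns `length` characters."""
    if start < 1 or length < 0:
        raise ValueError("start must be >= 1 and length must be >= 0")
    return text[start - 1 : start - 1 + length]


# Build a sample string long enough to have an 82nd character.
a_string = "".join(chr(ord("a") + i % 26) for i in range(100))

# Python indexes from 0, so the 82nd character lives at index 81.
by_index = a_string[81]
by_function = mid(a_string, 82, 1)
print(by_index, by_function, by_index == by_function)
```

The indexed form avoids a function call, which is the convenience the notation is meant to offer.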

In C

In C, a string is normally represented as an array of characters terminated by a null character rather than as a dedicated string data type. Because a string is really an array of characters, referring to the string in most expressions means referring to the address of its first element, and individual characters are reached with the ordinary array subscript. Hence in C, an expression such as a_string[81] (the 82nd character, since indices start at 0) is a legitimate example of brace notation.

Note that each a_string[n] has the type 'char', while a_string itself evaluates to a pointer to the first element of the a_string character array.

In C#

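C# handles brace notation differently. A string is a built-in reference type (System.String) rather than a character array, and its read-only indexer returns a char when used with brace notation, so for a string s the expression s[0] has the type char.

To change the char type to a string in C#, use the method ToString(). This matters because + applied to two char values performs integer addition; once the characters have been converted to strings, + acts as the concatenation operator and joins them.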

In Python

In Python, strings are immutable, so an existing string cannot be modified in place; a new string has to be built from pieces of the old one. Concatenating strings with + makes that easy, and extracting a single character with brace notation, as in var[0], is easier still.
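A minimal sketch of concatenation, extraction, and immutability, assuming an illustrative string named `var` (the sample values are not from the source):

```python
greeting = "hello"
var = greeting + " " + "world"   # concatenation builds a new string

first = var[0]                   # brace notation: first character
print(var, first)

# Strings are immutable: item assignment raises TypeError.
try:
    var[0] = "H"
except TypeError as exc:
    print("cannot modify in place:", exc)

# "Modifying" means building a new string from pieces of the old one.
capitalized = "H" + var[1:]
print(capitalized)
```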

Python also accepts negative indices: for a string var, var[-1] is the first character counting from the end of the string, that is, the last character. Index 0 is the first character, and indices 1 and above move through the string from left to right. Indices -1 and below move from right to left: since there are no characters before index 0, Python counts back from the end of the string instead, so a negative index i refers to the same character as index n+i. If a string has length n, the largest valid index is n-1 and the smallest is -n, which returns the same character as index 0, namely the first character. An index outside that range raises an IndexError.
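The index boundaries can be checked directly. The sketch below assumes an illustrative string `var`; the loop verifies that each negative index names the same character as the corresponding non-negative one.

```python
var = "hello world"
n = len(var)

print(var[0], var[-1])

# Every negative index i names the same character as n + i.
for i in range(-n, 0):
    assert var[i] == var[n + i]

# The valid range is -n .. n-1; anything else raises IndexError.
for bad in (n, -n - 1):
    try:
        var[bad]
    except IndexError:
        print(f"index {bad} is out of range")
```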

It is also possible to extract a sequence of characters with a slice such as var[0:5].

Notice that the last number in the slice is exclusive: Python extracts the characters beginning at index 0 up to, but excluding, index 5.
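A short sketch of the exclusive upper bound, again assuming an illustrative string `var`:

```python
var = "hello world"

head = var[0:5]      # indices 0, 1, 2, 3, 4
tail = var[6:]       # omitted end means "to the end of the string"
print(head, tail)

# Because the end is exclusive, the slice length is end - start,
# and adjacent slices join back into the original string.
assert len(var[0:5]) == 5 - 0
assert var[:5] + var[5:] == var
```

The exclusive end is what lets two slices that share a boundary index cover the string without overlapping.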

One can also extract every x-th character in the sequence by giving a step as a third number in the slice, for example x=2 to take every second character.
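A sketch of stepped slices with a step of 2, using an assumed sample string; the reversed slice is included only to show that the step may also be negative.

```python
var = "hello world"

every_second = var[::2]          # indices 0, 2, 4, ...
first_five_by_two = var[0:5:2]   # indices 0, 2, 4 of "hello"
reversed_var = var[::-1]         # a negative step walks right to left
print(every_second, first_five_by_two, reversed_var)
```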

In PHP

PHP strings can grow very large and, if a string is large enough, can use all available memory. In that case it may be better to break the string into an array with str_split() or explode() for finer control (the older split() function was removed in PHP 7). Brace notation in PHP is written with square brackets after the variable, as in $a[0]; the older curly-brace form $a{0} was deprecated in PHP 7.4 and removed in PHP 8.0.

Note that a string assigned to the variable $a may be written in either double quotes or single quotes. PHP expects the string to end with the same quotation mark that opened it. Brace notation on a string always returns a string type, not a separate character type.

In JavaScript

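JavaScript brace notation uses the same square-bracket syntax as C# and PHP. As in PHP, the result is a one-character string rather than a separate character type.

Negative indices are not part of the language: in current engines, an expression such as str[-1] on a string str evaluates to undefined rather than counting from the end of the string. Some older browsers reportedly behaved differently.

At index -1, Firefox is reported to have returned the length of the string. This did not work in Internet Explorer (IE).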

In MATLAB

MATLAB handles brace notation slightly differently from most common programming languages.

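Strings are treated as matrices of characters, so the index is enclosed in parentheses and the first character has index 1, as in s(1) for a string s. A useful trait of brace notation in MATLAB is that it supports an index range, much like Python, although both ends of a MATLAB range are inclusive: s(1:5) returns the first five characters.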

The use of square brackets [ ] is reserved for creating matrices in MATLAB.

Drawbacks

While this notation is faster than calling a substring function, in languages such as C it is also more dangerous because it performs no check against the length of the string, so the result of an out-of-range access cannot be predicted. Languages such as Python and C# do check the index and raise an error (IndexError and IndexOutOfRangeException respectively) instead.

In particular, using brace notation in C without making sure the index is within the limits of the string can have serious consequences for a program: if a character is read from beyond the string terminator, the program may be reading memory that is not allocated to that string. This memory could hold another string, a pointer, or something completely different, or it could be unallocated space. Writing to this memory causes unpredictable results and may end in a SIGSEGV (segmentation fault).
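The limit check described above can be made explicit before indexing. The following is a minimal sketch in Python, where `char_at` is a hypothetical helper (not from the source) and returning None for an out-of-range index is an assumed design choice. Python already raises IndexError on its own, so the sketch mainly illustrates the kind of guard that unchecked languages leave to the programmer.

```python
from typing import Optional


def char_at(text: str, index: int) -> Optional[str]:
    """Hypothetical helper: return text[index], or None when index is out of range."""
    if 0 <= index < len(text):
        return text[index]
    return None


a_string = "hello"
print(char_at(a_string, 1))    # in range
print(char_at(a_string, 82))   # out of range: None instead of an error
```

Rejecting negative indices here is deliberate: it mirrors the C case, where an index below zero also falls outside the string.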
