Word Boundary
Regular Expressions: Word Boundary
What does a word boundary in regex denote in JavaScript?
View Answer:
How is a word boundary represented in JavaScript's regular expressions?
View Answer:
let str = "Hello, welcome to HelloJavaScript. HelloJavaScript is great!";
let regex = /\bHelloJavaScript\b/g;
let matches = str.match(regex);
console.log(matches); // prints: [ 'HelloJavaScript', 'HelloJavaScript' ]
In this example, the regular expression /\bHelloJavaScript\b/g matches "HelloJavaScript" only where it appears as a whole word (not as part of another word). The g flag at the end of the regular expression requests a global search (find all matches rather than stopping after the first match).
If we didn't use the word boundary \b, we would also match words that contain "HelloJavaScript" as a substring. For example:
let str = "Hello, welcome to HelloJavaScript. HelloJavaScriptProgramming is great!";
let regex = /HelloJavaScript/g;
let matches = str.match(regex);
console.log(matches); // prints: [ 'HelloJavaScript', 'HelloJavaScript' ]
Here, the regular expression /HelloJavaScript/g matches both the standalone "HelloJavaScript" and the "HelloJavaScript" inside "HelloJavaScriptProgramming", because no word boundary is required. With /\bHelloJavaScript\b/g, only the standalone occurrence would match.
What is the opposite of a word boundary in regex?
View Answer:
let str = "Hello, welcome to HelloJavaScript. MyHelloJavaScriptProgramming is great!";
let regex = /\BHelloJavaScript\B/g;
let matches = str.match(regex);
console.log(matches); // prints: [ 'HelloJavaScript' ]
In this example, the regular expression /\BHelloJavaScript\B/g matches "HelloJavaScript" only when it has word characters directly before and after it, as in "MyHelloJavaScriptProgramming". The standalone "HelloJavaScript" is skipped because it is preceded by a space, which creates a word boundary. The g flag at the end of the regular expression requests a global search (find all matches rather than stopping after the first match).
If the string were "Hello, welcome to HelloJavaScript. HelloJavaScript is great!", the same regex would find nothing, because every "HelloJavaScript" in it stands alone rather than sitting inside another word:
let str = "Hello, welcome to HelloJavaScript. HelloJavaScript is great!";
let regex = /\BHelloJavaScript\B/g;
let matches = str.match(regex);
console.log(matches); // prints: null
In this case, there is no match, so the match function returns null.
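As a further illustration, \B can also be placed on just one side of a pattern. The sketch below uses a made-up sample string (not from the examples above) and matches "Java" only when it starts a word and more word characters follow it:

```javascript
// Hypothetical sample text, chosen only for illustration.
const sample = "Java JavaScript Javanese";

// \b on the left: "Java" must start a word.
// \B on the right: "Java" must NOT be the end of that word.
const prefixOnly = sample.match(/\bJava\B/g);

console.log(prefixOnly); // [ 'Java', 'Java' ] (from "JavaScript" and "Javanese")
```

The standalone "Java" is skipped because the space after it creates a word boundary, which \B rejects.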
What characters does JavaScript consider as word characters for the '\b' meta-character?
View Answer:
let str = "Hello, world! This is sample text with_123 some word boundaries.";
let regex = /\b\w+\b/g;
let matches = str.match(regex);
console.log(matches);
// prints: [ 'Hello', 'world', 'This', 'is', 'sample', 'text', 'with_123', 'some', 'word', 'boundaries' ]
Note that \b treats every character other than ASCII letters, digits, and the underscore as a non-word character. A word boundary therefore occurs wherever a word character sits next to a non-word character, or next to the start or end of the string.
let str = "Hello, world! This is a sample-text with_123 some word-boundaries.";
let regex = /\b\w+\b/g;
let matches = str.match(regex);
console.log(matches); // prints: [ 'Hello', 'world', 'This', 'is', 'a', 'sample', 'text', 'with_123', 'some', 'word', 'boundaries' ]
In this case, "sample-text" and "word-boundaries" are each split into two separate words because the hyphen (-) is a non-word character.
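To make the rule concrete, here is a minimal sketch that computes boundary positions by hand and compares them with what the regex engine reports. The helper names isWordChar and boundaryPositions are hypothetical, introduced only for this illustration:

```javascript
// True when ch is one of the characters \w matches: A-Z, a-z, 0-9 or underscore.
// An undefined ch (a position outside the string) counts as non-word.
function isWordChar(ch) {
  return ch !== undefined && /^[A-Za-z0-9_]$/.test(ch);
}

// Return every index i where exactly one of s[i - 1] and s[i] is a word character.
function boundaryPositions(s) {
  const positions = [];
  for (let i = 0; i <= s.length; i++) {
    if (isWordChar(s[i - 1]) !== isWordChar(s[i])) positions.push(i);
  }
  return positions;
}

// Positions the regex engine reports for \b in the same string.
const viaRegex = [...'sample-text'.matchAll(/\b/g)].map((m) => m.index);

console.log(boundaryPositions('sample-text')); // [ 0, 6, 7, 11 ]
console.log(viaRegex);                         // [ 0, 6, 7, 11 ]
```

Both approaches agree: there is a boundary on each side of the hyphen and at both ends of the string.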
Can you explain why 'at' doesn't match 'cat' when using '\bat' in JavaScript Regex?
View Answer:
let str = "cat in the hat";
let regex = /\bat\b/g;
let matches = str.match(regex);
console.log(matches); // output: null
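The following sketch separates the two conditions, assuming the same sample string. It shows that the leading \b is what rejects "cat" and "hat", while a trailing \b alone would accept them:

```javascript
const phrase = "cat in the hat";

// \bat requires "at" to START a word; in "cat" and "hat" it is preceded by a letter.
console.log(/\bat/.test(phrase));   // false

// at\b only requires "at" to END a word, which both "cat" and "hat" satisfy.
console.log(phrase.match(/at\b/g)); // [ 'at', 'at' ]

// A standalone "at" satisfies both conditions.
console.log(/\bat\b/.test("look at the cat")); // true
```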
How does '\b' behave differently at the start and end of a string in JavaScript Regex?
View Answer:
// Using '\b' at the start of a string
const regexStart = /\bfoo/;
console.log(regexStart.test('foo bar')); // Output: true
console.log(regexStart.test('foobar')); // Output: true
// Using '\b' at the end of a string
const regexEnd = /bar\b/;
console.log(regexEnd.test('foo bar')); // Output: true
console.log(regexEnd.test('barfoo')); // Output: false
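A short sketch of how the edges of a string behave, using made-up inputs: the start or end of the string acts like a non-word character, so a boundary exists there only when the adjacent character is a word character.

```javascript
// The index of the first word boundary depends on the first character.
console.log("foo".search(/\b/));  // 0: start of string followed by a word character
console.log("!foo".search(/\b/)); // 1: the boundary sits between "!" and "f"

// With no word characters at all, there is no boundary anywhere.
console.log(/\b/.test("!?"));     // false
console.log(/\b/.test(""));       // false
```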
Can a word boundary match a position between two non-word characters in JavaScript Regex?
View Answer:
const regex = /\bfoo\b/;
console.log(regex.test('foo')); // Output: true
console.log(regex.test('foo bar')); // Output: true
console.log(regex.test('foobar')); // Output: false
console.log(regex.test('foo_bar')); // Output: false
console.log(regex.test('foo123')); // Output: false
console.log(regex.test('123 foo 456')); // Output: true
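To address the question directly, the sketch below (with illustrative inputs) checks the position between two non-word characters. \b fails there, and its complement \B succeeds:

```javascript
// Between two non-word characters there is no word boundary...
console.log(/!\b!/.test("!!"));  // false
// ...so the non-boundary assertion \B succeeds there instead.
console.log(/!\B!/.test("!!"));  // true

// The same holds between two word characters.
console.log(/o\bo/.test("foo")); // false
console.log(/o\Bo/.test("foo")); // true
```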
Why do we use word boundaries in JavaScript regular expressions?
View Answer:
What will '\b' match in the string "#apple#" in JavaScript regex?
View Answer:
let str = '#apple#'
const regex = /\bapple\b/;
const match = str.match(regex);
console.log(match); // output: ["apple"]
console.log(regex.test('#apple#')); // Output: true
Can Regex word boundaries consume characters in a string?
View Answer:
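A word boundary is a zero-width assertion: it matches a position, not a character. One way to see this, sketched here with a made-up string, is to insert a marker at every boundary and observe that none of the original text is lost:

```javascript
const words = "foo bar";

// Inserting a marker at each boundary removes nothing from the original text.
console.log(words.replace(/\b/g, "|")); // "|foo| |bar|"

// The matched text is the empty string at every boundary.
const lengths = [...words.matchAll(/\b/g)].map((m) => m[0].length);
console.log(lengths); // [ 0, 0, 0, 0 ]
```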
How would a word boundary handle punctuations in JavaScript regex?
View Answer:
const regex = /\bfoo\b/;
console.log(regex.test('foo!bar')); // Output: true
console.log(regex.test('foo!')); // Output: true
console.log(regex.test('bar?foo')); // Output: true
console.log(regex.test('bar.foo')); // Output: true
// The match contains only the word, not the surrounding punctuation
const punctuated = 'bar?foo';
const match = punctuated.match(regex);
console.log(match); // ["foo"]
What happens if you apply a global search with '\b' in JavaScript Regex?
View Answer:
let text = "I like apple. I love to eat an apple. The apple is red.";
let regex = /\bapple\b/g;
let result = text.match(regex);
console.log(result); // This will output: [ 'apple', 'apple', 'apple' ]
How does the word boundary regex behave in the case of consecutive word characters in JavaScript?
View Answer:
let text = "apple123 orange4567 banana89";
let regex = /\b/g;
let result = text.split(regex);
console.log(result); // This will output: [ 'apple123', ' ', 'orange4567', ' ', 'banana89' ]
Can you use a word boundary to match a space character in JavaScript Regex?
View Answer:
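Because \b matches a position rather than a character, it cannot match a space by itself. It can, however, be combined with a literal space to require a word character on each side. A minimal sketch with illustrative inputs:

```javascript
// \b alone produces an empty match, never the space character.
console.log("a b".match(/\b/)[0] === ""); // true

// Match the space explicitly and use \b to assert a word character on each side.
console.log("a b".match(/\b \b/g));  // [ ' ' ]

// With two spaces, neither one has a word character on both sides.
console.log("a  b".match(/\b \b/g)); // null
```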
How can you combine word boundaries with other regex elements in JavaScript?
View Answer:
let text = "cat, concatenate, cataract";
let regex = /\bcat\b/g;
let result = text.match(regex);
console.log(result); // Outputs: ['cat']
In this code, \bcat\b only matches the standalone word "cat", not "cat" in "concatenate" or "cataract".
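Word boundaries can also wrap richer patterns. The following sketch, using a made-up sentence, combines \b with a non-capturing group, alternation, an optional quantifier, and the i and g flags:

```javascript
const pets = "Cats and dogs; concatenate, dogma, cat.";

// \b ... \b wraps an alternation (cat or dog) followed by an optional plural "s".
// The i flag ignores case; the g flag collects every match.
const petWords = pets.match(/\b(?:cat|dog)s?\b/gi);

console.log(petWords); // [ 'Cats', 'dogs', 'cat' ]
```

"concatenate" and "dogma" are skipped because the candidate text is not delimited by a boundary on both sides.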
How would you use a word boundary to match 'end' but not 'ending' in a JavaScript Regex?
View Answer:
let text = "end ending bend";
let regex = /\bend\b/g;
let result = text.match(regex);
console.log(result); // Outputs: ['end']
In this code, \bend\b only matches the standalone word "end", not "end" in "ending" or "bend".
How can word boundaries help in validating user input in JavaScript?
View Answer:
Here is a code snippet where word boundaries are used to validate user input for a specific username format (only allows alphanumeric characters and underscores).
function validateUsername(username) {
let regex = /^\b\w+\b$/;
return regex.test(username);
}
console.log(validateUsername('username_1')); // Outputs: true
console.log(validateUsername('username@1')); // Outputs: false
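Here, the regex ^\b\w+\b$ checks that the entire username (^...$) consists of one or more word characters (\w+) enclosed by word boundaries (\b). This ensures the username doesn't contain invalid characters. Note that the anchors together with \w+ already enforce this on their own, so the \b assertions are redundant in this particular pattern and appear only for illustration.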
Does '\b' meta-character match the beginning or end of a line in JavaScript Regex?
View Answer:
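\b is not a line anchor. It can coincide with the start or end of a line, but only when a word character sits at that edge. The sketch below, with a made-up two-line string, compares the positions matched by ^ (with the m flag) against those matched by \b:

```javascript
const lines = "#tag\nfoo bar";

// ^ with the m flag matches at the start of every line: indices 0 and 5.
const lineStarts = [...lines.matchAll(/^/gm)].map((m) => m.index);
console.log(lineStarts); // [ 0, 5 ]

// \b matches only where a word character meets a non-word character or an edge.
// Line 1 starts with "#", so its first boundary is at index 1, not 0.
const boundaries = [...lines.matchAll(/\b/g)].map((m) => m.index);
console.log(boundaries); // [ 1, 4, 5, 8, 9, 12 ]
```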
Can you use word boundaries to replace specific words in a string in JavaScript?
View Answer:
let text = "I love apples. I love to eat apples.";
let regex = /\bapples\b/g;
let newText = text.replace(regex, 'oranges');
console.log(newText); // Outputs: "I love oranges. I love to eat oranges."
In the above code, the replace() method uses the regex with word boundaries (\b) to replace all standalone instances of "apples" with "oranges".
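When the word to replace is only known at runtime, the pattern has to be built from a string. The sketch below is one possible approach; replaceWholeWord and escapeRegExp are hypothetical helper names, not part of any library:

```javascript
// Escape characters that have special meaning in a regular expression.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: replace whole-word occurrences of `word` in `input`.
// Note the doubled backslashes: inside a string literal, "\b" is a backspace.
function replaceWholeWord(input, word, replacement) {
  const pattern = new RegExp("\\b" + escapeRegExp(word) + "\\b", "g");
  return input.replace(pattern, replacement);
}

console.log(replaceWholeWord("I love apples and pineapples.", "apples", "oranges"));
// "I love oranges and pineapples."
```

This only behaves as intended when the word starts and ends with a word character; for a word such as "C++", the trailing \b would require a word character after the final "+".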
Can you use word boundaries in a character set in JavaScript regex?
View Answer:
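Inside a character set, \b does not mean "word boundary". In JavaScript, [\b] matches the backspace character (U+0008). A brief sketch with illustrative inputs:

```javascript
// Inside square brackets, \b is the backspace character, not a word boundary.
console.log(/[\b]/.test("\b"));      // true: the string contains a backspace
console.log(/[\b]/.test("foo bar")); // false: no backspace, despite four word boundaries

// In a string literal, "\b" is also the backspace character (code 8).
console.log("\b".charCodeAt(0));     // 8
```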
What is a word boundary \b in regular expressions (regexp)?
View Answer:
console.log('Hello, Java!'.match(/\bJava\b/)); // Java
console.log('Hello, JavaScript!'.match(/\bJava\b/)); // null
// More Examples
console.log('Hello, Java!'.match(/\bHello\b/)); // Hello
console.log('Hello, Java!'.match(/\bJava\b/)); // Java
console.log('Hello, Java!'.match(/\bHell\b/)); // null (no match)
console.log('Hello, Java!'.match(/\bJava!\b/)); // null (no match)
// Digit Boundaries
console.log('1 23 456 78'.match(/\b\d\d\b/g)); // [ '23', '78' ]
console.log('12,34,56'.match(/\b\d\d\b/g)); // [ '12', '34', '56' ]
Does a word boundary work on Non-Latin alphabets?
View Answer:
Here's an example with Cyrillic characters:
let text = "яблоко груша банан";
let regex = /\bяблоко\b/g;
let result = text.match(regex);
console.log(result); // Outputs: null
In this code, we're trying to match the Russian word for "apple" ("яблоко"). However, the output is null, indicating no matches, because \b doesn't treat Cyrillic letters as word characters.
While \b is useful with English and other text made up of ASCII letters, digits, and underscores, other scripts (and accented Latin letters such as é) need a different approach, such as Unicode-aware lookarounds or a library that supports Unicode word boundaries.
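One possible workaround, sketched here under stated assumptions, is to emulate the boundary with lookarounds and Unicode property escapes (this needs the u flag and an engine that supports lookbehind). The definition of "word character" used below, any Unicode letter, any Unicode number, or an underscore, is an assumption made for this example and may need adjusting for a given language:

```javascript
const fruit = "яблоко груша банан";

// Assumption: a "word character" is any Unicode letter (\p{L}), number (\p{N}), or "_".
// (?<!...) and (?!...) assert that no such character sits directly before or after the word.
const unicodeWord = /(?<![\p{L}\p{N}_])яблоко(?![\p{L}\p{N}_])/gu;

console.log(fruit.match(unicodeWord));       // [ 'яблоко' ]
console.log("яблоковый".match(unicodeWord)); // null: more letters follow the word
```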