
InDesign CS3


Search using GREP expressions

On the GREP tab of the Find/Change dialog box, you can construct GREP expressions to find alphanumeric strings and patterns in long documents or many open documents. You can enter the GREP metacharacters manually or choose them from the Special Characters For Search list. GREP searches are case-sensitive by default.

  1. Choose Edit > Find/Change, and click the GREP tab.

  2. At the bottom of the dialog box, specify the range of your search from the Search menu, and click icons to include locked layers, master pages, footnotes, and other items in the search. (See Search options for finding and changing text.)

  3. In the Find What box, do any of the following to construct a GREP expression:

    • Enter the search expression manually. (See Metacharacters for searching.)

    • Click the Special Characters For Search icon to the right of the Find What option and choose options from the Locations, Repeat, Match, Modifiers, and Posix submenus to help construct the search expression.

  4. In the Change To box, type or paste the replacement text.

  5. Click Find.

  6. To continue searching, click Find Next, Change (to change the current occurrence), Change All (a message indicates the total number of changes), or Change/Find (to change the current occurrence and search for the next one).

Tips for constructing GREP searches

Here are some tips for constructing GREP expressions.

  • Many searches under the GREP tab are similar to those under the Text tab, but be aware that you need to insert different codes depending on which tab you’re using. In general, the Text tab metacharacters begin with a ^ (such as ^t for a tab) and GREP tab metacharacters begin with a \ (such as \t for a tab). However, not all metacharacters follow this rule. For example, a paragraph return is ^p in the Text tab and \r in the GREP tab. For a list of the metacharacters used for the Text and GREP tabs, see Metacharacters for searching.

  • To search for a character that has symbolic meaning in GREP, enter a backslash (\) before the character to indicate that the character that follows is literal. For example, a period ( . ) searches for any character in a GREP search; to search for an actual period, enter “\.”

  • Save the GREP search as a query if you intend to run it often or share it with someone else. (See Find/change using queries.)

  • Use parentheses to divide your search into subexpressions. For example, if you want to search for “cat” or “cot,” you can use the c(a|o)t string. Parentheses are especially useful to identify groupings. For example, searching for “the (cat) and the (dog)” identifies “cat” as Found Text 1 and “dog” as Found Text 2. You can use the Found Text expressions (such as $1 for Found Text 1) to change only part of the found text.
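The escaping, alternation, and grouping tips above can be tried outside InDesign. The following is a minimal sketch that uses Python's re module as a stand-in for the GREP tab; that substitution is an assumption made for illustration, and the sample strings are invented. The two dialects agree on these basics, but Python writes group references as \1 and \2 where the Change To box uses $1 and $2.

```python
import re

# Python's re module is used as a stand-in for InDesign's GREP engine
# (an assumption for illustration; the dialects are similar, not identical).
text = "The cat sat on a cot. The cut was 3.5 cm long; see page 3x5."

# An unescaped period matches any character, so "3.5" also matches "3x5".
any_char = re.findall(r"3.5", text)

# A backslash makes the period literal.
literal_period = re.findall(r"3\.5", text)

# c(a|o)t matches "cat" or "cot" but not "cut".
cat_or_cot = [m.group(0) for m in re.finditer(r"c(a|o)t", text)]

# Each parenthesized group is numbered (Found Text 1, Found Text 2, ...).
sentence = "I saw the cat and the dog."
pets = re.search(r"the (cat) and the (dog)", sentence)
found_text_1, found_text_2 = pets.group(1), pets.group(2)

# Reusing the groups in the replacement swaps the two animals.
swapped = re.sub(r"the (cat) and the (dog)", r"the \2 and the \1", sentence)

print(any_char, literal_period, cat_or_cot, found_text_1, found_text_2, swapped)
```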

GREP search examples

Follow these examples to learn how to take advantage of GREP expressions.

Example 1: Finding text within quotation marks

Suppose you want to search for any word enclosed in quotation marks (such as “Spain”), and you want to remove the quotation marks and apply a style to the word (so that it becomes Spain instead of “Spain”). The expression (")(\w+)(") includes three groupings, as indicated by parentheses ( ). The first and third groupings search for any quotation mark, and the second grouping searches for one or more word characters.

You can use the Found Text expressions to refer to these groupings. For example, $0 refers to all found text, and $2 refers to only the second grouping. By inserting $2 in the Change To field and specifying a character style in the Change Format field, you can search for a word within quotation marks, remove the quotation marks, and apply a character style to the word. Because only $2 is specified, the $1 and $3 groupings are removed. (Specifying $0 or $1$2$3 in the Change To field would apply the character style to the quotation marks as well.)

GREP example (figure callouts): A. Finds all word characters enclosed in quotation marks; B. Change applies only to the second grouping; C. Character style applied

This example searches only for single words enclosed in quotation marks. If you want to search for phrases enclosed in quotation marks, add wildcard expressions, such as (\s*.*\w*\d*), which looks for spaces, characters, word characters, and digits.
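As a rough illustration of Example 1, the sketch below runs the same three-group expression through Python's re module. This is an assumption made for illustration, not InDesign itself: Python writes the second grouping as \2 rather than $2, matches only the straight quotation mark typed in the pattern, and cannot model the character style applied in the Change Format field. The sample sentences are invented.

```python
import re

# Three groupings: opening quote, one or more word characters, closing quote.
QUOTED_WORD = r'(")(\w+)(")'

sample = 'She visited "Spain" and "Portugal" last year.'

# Keeping only grouping 2 drops the quotation marks (groupings 1 and 3).
unquoted = re.sub(QUOTED_WORD, r"\2", sample)

# Writing all three groupings back leaves the text unchanged.
unchanged = re.sub(QUOTED_WORD, r"\1\2\3", sample)

# \w+ stops at the first space, so a quoted phrase is not matched...
phrase = 'Read "The Purple Cow" today.'
no_match = re.search(QUOTED_WORD, phrase)

# ...but the wider second grouping suggested above does match it.
QUOTED_PHRASE = r'(")(\s*.*\w*\d*)(")'
unquoted_phrase = re.sub(QUOTED_PHRASE, r"\2", phrase)

print(unquoted)
print(unquoted_phrase)
```

Because .* is greedy, the wider expression runs from the first quotation mark on a line to the last one, so it is best suited to lines that contain a single quoted phrase.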

Example 2: Phone numbers

InDesign includes a number of search presets that you can choose from the Queries menu. For example, you can choose the Phone Number Conversion query, which looks like this:

\(?(\d\d\d)\)?[-. ]?(\d\d\d)[-. ]?(\d\d\d\d)

Phone numbers in the United States can appear in a variety of formats, such as 206-555-3982, (206) 555-3982, 206.555.3982, and 206 555 3982. This string looks for any of these variations. The first three digits (\d\d\d) of the phone number may or may not be enclosed in parentheses, so a question mark appears after each parenthesis: \(? and \)?. Note that the backslash \ indicates that the actual parenthesis is being searched for and that it’s not part of a subexpression. The brackets [ ] locate any character within them, so in this case, [-. ] finds either a hyphen, a period, or a space. The question mark after the brackets indicates that the items within them are optional in the search. Finally, the digits are enclosed in parentheses, which signify groupings that can be referred to in the Change To field.

You can edit the grouping references in the Change To field to suit your needs. For example, you could use these expressions:

206.555.3982 = $1.$2.$3

206-555-3982 = $1-$2-$3

(206) 555-3982 = ($1) $2-$3

206 555 3982 = $1 $2 $3
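The phone number expression and the four Change To variants can be checked with a short script. This is a sketch that assumes Python's re module as a stand-in for the GREP tab; the pattern is copied from the query above, while the helper name reformat_phone is hypothetical. Python writes the groupings as \1, \2, and \3 instead of $1, $2, and $3.

```python
import re

# Pattern from the Phone Number Conversion query.
PHONE = re.compile(r"\(?(\d\d\d)\)?[-. ]?(\d\d\d)[-. ]?(\d\d\d\d)")

# Python equivalents of the Change To strings ($1 becomes \1, and so on).
TEMPLATES = {
    "dots": r"\1.\2.\3",
    "hyphens": r"\1-\2-\3",
    "parens": r"(\1) \2-\3",
    "spaces": r"\1 \2 \3",
}


def reformat_phone(text: str, style: str) -> str:
    """Rewrite every phone number found in text using the chosen template."""
    return PHONE.sub(TEMPLATES[style], text)


samples = ["206-555-3982", "(206) 555-3982", "206.555.3982", "206 555 3982"]

# Every input variation converges on the same output for a given template.
as_dots = [reformat_phone(s, "dots") for s in samples]
as_parens = [reformat_phone(s, "parens") for s in samples]

print(as_dots)
print(as_parens)
```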

Additional GREP examples

Experiment with these examples to learn more about GREP searches.

Expression

Search string

Sample text

Matches (in bold)

Class of characters

[ ]

[abc]

Finds the letter a, b, or c.

Maria cuenta bien.

Maria cuenta bien.

Beginning of paragraph

^

^~_.+

This searches the beginning of the paragraph (^) for an em dash (~_) followed by any character ( . ) one or more times (+).

“We saw—or at least we think we saw—a purple cow.”

—Konrad Yoes

“We saw—or at least we think we saw—a purple cow.”

—Konrad Yoes

Negative lookahead

(?!pattern)

InDesign (?!CS.*?)

The negative lookahead matches the search string only if it is not followed by the specified pattern.

InDesign, InDesign 2.0, InDesign CS, and InDesign CS2

InDesign, InDesign 2.0, InDesign CS, and InDesign CS2

Positive lookahead

(?=pattern)

InDesign (?=CS.*?)

The positive lookahead matches the search string only if it is followed by the specified pattern.

Use similar patterns for negative lookbehinds (?<!pattern) and positive lookbehinds (?<=pattern).

InDesign, InDesign 2.0, InDesign CS, and InDesign CS2

InDesign, InDesign 2.0, InDesign CS, and InDesign CS2

Groupings

( )

(quick) (brown) (fox)

The quick brown fox jumps up and down.

The quick brown fox jumps up and down.

All found text = quick brown fox; Found Text 1 = quick; Found Text 2 = brown; Found Text 3 = fox

Non-marking parentheses

(?:expression)

(quick) (?:brown) (fox)

The quick brown fox jumps up and down.

The quick brown fox jumps up and down.

All found text = quick brown fox; Found Text 1 = quick; Found Text 2 = fox

Case-insensitive on

(?i)

(?i)apple

You can also use (?i:apple)

Apple apple APPLE

Apple apple APPLE

Case-insensitive off

(?-i)

(?-i)apple

Apple apple APPLE

Apple apple APPLE

Multiline on

(?m)

(?m)^\w+

In this example, the expression looks for one or more (+) word characters (\w) at the beginning of a line (^). The (?m) expression allows all lines within the found text to be treated as separate lines.

One Two Three Four Five Six Seven Eight

One Two Three Four Five Six Seven Eight

Multiline off

(?-m)

(?-m)^\w+

One Two Three Four Five Six Seven Eight

One Two Three Four Five Six Seven Eight

Single-line on

(?s)

(?s)c.a

This searches for any character ( . ) between the letters c and a. The (?s) expression matches any character, even if it falls on the next line.

abc abc abc abc

abc abc abc abc

Single-line off

(?-s)

(?-s)c.a

abc abc abc abc

abc abc abc abc

Ignore whitespace on

(?x)

(?x)\w \w\w

This searches for any word character (\w) followed by a space, followed by two more word characters (\w\w). The (?x) expression essentially ignores all whitespace so that it looks for three characters in a row (\w\w\w).

The quick brown fox

The quick brown fox

Ignore whitespace off

(?-x)

(?-x)\w \w\w

The quick brown fox

The quick brown fox

Repeat number of times

{ }

b{3} matches exactly 3 times

b{3,} matches at least 3 times

b{3,}? matches at least 3 times (shortest match)

b{2,3} matches at least 2 times and not more than 3

b{2,3}? matches at least 2 times and not more than 3 (shortest match)

abbc abbbc abbbbc abbbbbc

abbc abbbc abbbbc abbbbbc

abbc abbbc abbbbc abbbbbc

abbc abbbc abbbbc abbbbbc

abbc abbbc abbbbc abbbbbc

abbc abbbc abbbbc abbbbbc
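Several rows of the table above can be reproduced with a short script, which is useful here because the bold highlighting in the Matches column does not survive in plain text. This is a sketch that assumes Python's re module as a stand-in for the GREP tab. Two assumptions are worth noting: the multiline and single-line samples are taken to be broken across lines (with \n standing in for the paragraph returns), and Python has no standalone (?-i), (?-m), (?-s), or (?-x), so the "off" rows are shown by simply omitting the modifier.

```python
import re

products = "InDesign, InDesign 2.0, InDesign CS, and InDesign CS2"

# Lookaheads test what follows "InDesign " without including it in the match.
not_before_cs = re.findall(r"InDesign (?!CS.*?)", products)  # before "2.0" only
before_cs = re.findall(r"InDesign (?=CS.*?)", products)      # before "CS" and "CS2"

# Non-marking parentheses group without creating a Found Text number.
fox = re.search(r"(quick) (?:brown) (fox)", "The quick brown fox jumps up and down.")
all_found, numbered = fox.group(0), fox.groups()

# Case-insensitive on, then the case-sensitive default.
apples_any_case = re.findall(r"(?i)apple", "Apple apple APPLE")
apples_exact = re.findall(r"apple", "Apple apple APPLE")

# Multiline: ^ matches at the start of every line instead of only the first.
lines = "One Two Three\nFour Five Six\nSeven Eight"
first_words = re.findall(r"(?m)^\w+", lines)
first_word_only = re.findall(r"^\w+", lines)

# Single-line: the period also matches the line break between c and a.
across_breaks = re.findall(r"(?s)c.a", "abc\nabc\nabc\nabc")
within_lines = re.findall(r"c.a", "abc\nabc\nabc\nabc")

# Ignore whitespace: the space in the pattern is skipped, leaving \w\w\w.
ignoring_space = re.findall(r"(?x)\w \w\w", "The quick brown fox")
with_space = re.findall(r"\w \w\w", "The quick brown fox")

# Repeat counts, greedy and shortest-match forms.
bs = "abbc abbbc abbbbc abbbbbc"
exactly_3 = re.findall(r"b{3}", bs)
at_least_3 = re.findall(r"b{3,}", bs)
at_least_3_shortest = re.findall(r"b{3,}?", bs)
two_to_3 = re.findall(r"b{2,3}", bs)
two_to_3_shortest = re.findall(r"b{2,3}?", bs)

print(at_least_3, two_to_3)
```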



Comments


Synaps1s said on Jul 14, 2007 at 6:55 AM :
For special characters like [ or ] add \ before the Character. The following example would look for any text between [ and ]
Just put this in Your Find / GREP Window:

(\[) *.*?\w*(\])

You can find a great Article about Regular Expressions along some Tutorials on Wikipedia: http://en.wikipedia.org/wiki/Regular_Expression
MercuryG said on Oct 15, 2007 at 7:22 AM :
the one important feature omitted from the GREP search article is how to
search for formatting within a string (such as how to replace a space
between a bold word followed by an italic word.)
Bob - Adobe Writer said on Oct 15, 2007 at 7:45 AM :
I don't think it's possible to search for formatting within a GREP search string. Someone please correct me if I'm wrong.
No screen name said on Oct 22, 2007 at 3:11 AM :
It also appears that it's not possible to use conditionals in the replace as
in some other programs.

Eg

Search for (a)(b)?
Replace with $1(?2:c)

Other programs:
ab -> ac
a -> a
abc -> acc
acb -> acb

InDesign:
ab -> a(?2:c) (so it's not recognising the conditional in the replace)
Bob - Adobe Writer said on Oct 24, 2007 at 3:43 PM :
Here are a few good GREP resources:

http://www.night-ray.com/regex.pdf (PDF download)

http://www.regular-expressions.info

http://www.indesignmag.com/idm/issues.html#old (download Issue #17)

Leave a comment if you're aware of other good GREP resources.
No screen name said on Oct 28, 2007 at 1:59 PM :
PDF Book on GREP and ID, $10:
http://safari.oreilly.com/9780596517069
Bob - Adobe Writer said on Oct 29, 2007 at 2:06 PM :
Yes, I just looked at this PDF. It's a 47-page document with detailed examples, troubleshooting, and quick reference information. Here's the contents:

Contents
GREP by Example ..................................... 2
The Basics ................................................... 4
Wildcards .................................................... 9
Locations ................................................... 14
Repeat: Sequences of Characters ........... 16
Referring to Wildcards: Back-Referencing ............................................... 20
Finding Formatted Text .......................... 22
Replacing with Wildcards ....................... 22
Splitting Up Complex Expressions ......... 28
Applying Styles with GREP .................... 31
Apply Formatting to Part of What You Find: Look Around .................................. 33
Look Behind You ..................................... 35
Look Around ............................................ 36
Advanced Techniques .............................. 39
Chaining Expressions .............................. 40
Leftovers ................................................... 43
Troubleshooting ....................................... 43
Resources .................................................. 45
Quick Reference ....................................... 45
Acknowledgments .................................... 47
Armadillo Graphics said on Nov 30, 2007 at 3:49 AM :
InDesign should have style option for consecutive upper case characters since acronyms for example stand out in body text. This is my workaround using GREP and Character style.

GREP
Find what: \u{2,}
Change to: empty
Find format: empty
Change format: Character Style: Caps

My Caps character style in body text is usually 93-95% size with optical kerning and +10 tracking.
blr_36 said on Apr 2, 2008 at 12:15 AM :
I think, there is an optimization bug in the regex engine. The

(?:(?<=,)\d+)|(?:\d+$)

expression found only the first digit(s) after a comma in the line, the rest of them and the number at the end of line never found. Of course, it found the digit(s) at the end of line, when there is no comma-then-digit sequence.
Bob - Adobe Writer said on Jul 14, 2008 at 6:53 AM :
Another good GREP resource here:

http://jetsetcom.net/index.php?option=com_content&task=view&id=13&Itemid=1
GQID said on Jul 24, 2008 at 6:24 AM :
I have successfully isolated formats using GREP. It takes several steps. The
key to it is that some spaces are followed by a character of the same format,
and some are followed by one of different format. I search for space plus any
character of the same format, then substitute a placeholder character for the
space. Then search for the remaining spaces of the same format--those are
the ones followed by a different format--and substitute a different
placeholder character. Do the same for the 2nd format, and you end up with
formats that begin and end with unique combinations of placeholders. I’ve
only done this for separating 2 formats. And of course, the original text needs
to be clean and consistent.

 


Current page: http://livedocs.adobe.com/en_US/InDesign/5.0/WS1952D538-1335-4b1d-BA5E-FA5A176FDC9F.html