Added markdown manpage, which converts to roff using pandoc.

author: Bruce Hill <bruce@bruce-hill.com> 2021-05-19 22:02:45 -0700
committer: Bruce Hill <bruce@bruce-hill.com> 2021-05-19 22:02:45 -0700
commit: f824d3f3e2e3d3f1d441b94e18d7991ff523cef8 (patch)
tree: 99676777dd5550b4a0e6aa849f136769c0b6ff21 /bp.1.md
parent: 5d5817c2a3d21e91db4eaaf607ff881d71b57638 (diff)
1 files changed, 286 insertions, 0 deletions
diff --git a/bp.1.md b/bp.1.md
new file mode 100644
index 0000000..c58725c
--- /dev/null
+++ b/bp.1.md
@@ -0,0 +1,286 @@
+% BP(1)
+% Bruce Hill (*bruce@bruce-hill.com*)
+% May 17 2021
+
+# NAME
+
+bp - Bruce\'s Parsing Expression Grammar tool
+
+# SYNOPSIS
+
+**bp**
+\[*options...*\]
+*pattern*
+\[\[\--\] *files...*\]
+
+# DESCRIPTION
+
+**bp** is a tool that matches parsing expression grammars using a custom
+syntax.
+
+# OPTIONS
+
+**-v**, **\--verbose**
+: Print debugging information.
+
+**-e**, **\--explain**
+: Print a visual explanation of the matches.
+
+**-j**, **\--json**
+: Print a JSON list of the matches. (Pairs with **\--verbose** for more detail)
+
+**-l**, **\--list-files**
+: Print only the names of files containing matches instead of the matches
+themselves.
+
+**-i**, **\--ignore-case**
+: Perform pattern matching case-insensitively.
+
+**-I**, **\--inplace**
+: Perform filtering or replacement in-place (i.e. overwrite files with new
+content).
+
+**-C**, **\--confirm**
+: During in-place modification of a file, confirm before each modification.
+
+**-r**, **\--replace** *replacement*
+: Replace all occurrences of the main pattern with the given string.
+
+**-s**, **\--skip** *pattern*
+: While looking for matches, skip over *pattern* occurrences. This can be
+useful for behavior like **bp -s string** (avoiding matches inside string
+literals).
+
+**-g**, **\--grammar** *grammar-file*
+: Load the grammar from the given file. See the **GRAMMAR FILES** section
+for more info.
+
+**-G**, **\--git**
+: Use **git** to get a list of files. Remaining file arguments (if any) are
+passed to **git \--ls-files** instead of treated as literal files.
+
+**-c**, **\--context** *N*
+: The number of lines of context to print. If *N* is 0, print only the
+exact text of the matches. If *N* is **`"all"`**, print the entire file.
+Otherwise, if *N* is a positive integer, print the whole line on which
+matches occur, as well as the *N-1* lines before and after the match. The
+default value for this argument is **1** (print whole lines where matches
+occur).
+
+**-f**, **\--format** *auto*\|*fancy*\|*plain*
+: Set the output format. *fancy* includes colors and line numbers, *plain*
+includes neither, and *auto* (the default) uses *fancy* formatting only when
+the output is a TTY.
+
+**\--help**
+: Print the usage and exit.
+
+*pattern*
+: The main pattern for bp to match. By default, this pattern is a string
+pattern (see the **STRING PATTERNS** section below).
+
+*files...*
+: The input files to search. If no input files are provided and data was piped
+in, that data will be used instead. If neither are provided, **bp** will search
+through all files in the current directory and its subdirectories
+(recursively).
+
+
+# STRING PATTERNS
+
+One of the most common use cases for pattern matching tools is matching plain,
+literal strings, or strings that are primarily plain strings, with one or two
+patterns. **bp** is designed around this fact. The default mode for bp patterns
+is "string pattern mode". In string pattern mode, all characters are
+interpreted literally except for the backslash (**\\**), which may be followed
+by a bp pattern (see the **PATTERNS** section above). Optionally, the bp
+pattern may be terminated by a semicolon (**;**).
+
+
+# PATTERNS
+
+**bp** patterns are based off of a combination of Parsing Expression Grammars
+and regular expression syntax. The syntax is designed to map closely to verbal
+descriptions of the patterns, and prefix operators are preferred over suffix
+operators (as is common in regex syntax).
+
+Some patterns additionally have "multi-line" variants, which means that they
+include the newline character.
+
+*pat1 pat2*
+: A sequence: *pat1* followed by *pat2*
+
+*pat1* **/** *pat2*
+: A choice: *pat1*, or if it doesn\'t match, then *pat2*
+
+**.**
+: Any character (excluding newline)
+
+**\^**
+: Start of a line
+
+**\^\^**
+: Start of the text
+
+**\$**
+: End of a line (does not include newline character)
+
+**\$\$**
+: End of the text
+
+**\_**
+: Zero or more whitespace characters, including spaces and tabs, but not
+newlines.
+
+**\_\_**
+: Zero or more whitespace characters, including spaces, tabs, newlines, and
+comments. Comments are undefined by default, but may be defined by a separate
+grammar file. See the **GRAMMAR FILES** section for more info.
+
+**\"foo\"**, **\'foo\'**
+: The literal string **"foo"**. Single and double quotes are treated the same.
+Escape sequences are not allowed.
+
+**{foo}**
+: The literal string **"foo"** with word boundaries on either end. Escape
+sequences are not allowed.
+
+**\`***c*
+: The literal character *c* (e.g. **\`@** matches the "@" character)
+
+**\`***c1***,***c2*
+: The literal character *c1* or *c2* (e.g. **\`a,e,i,o,u**)
+
+**\`***c1***-***c2*
+: The character range *c1* to *c2* (e.g. **\`a-z**). Multiple ranges
+can be combined with a comma (e.g. **\`a-z,A-Z**).
+
+**\\***esc*
+: An escape sequence (e.g. **\\n**, **\\x1F**, **\\033**, etc.)
+
+**\\***esc1***-***esc2*
+: An escape sequence range from *esc1* to *esc2* (e.g. **\\x00-x1F**)
+
+**\\N**
+: A special case escape that matches a "nodent": one or more newlines followed
+by the same indentation that occurs on the current line.
+
+**!** *pat*
+: Not *pat*
+
+**\[** *pat* **\]**
+: Maybe *pat*
+
+*N* *pat*
+: Exactly *N* repetitions of *pat* (e.g. **5 \`x** matches **"xxxxx"**)
+
+*N* **-** *M* *pat*
+: Between *N* and *M* repetitions of *pat* (e.g. **2-3 \`x**
+matches **"xx"** or **"xxx"**)
+
+*N***+** *pat*
+: At least *N* or more repetitions of *pat* (e.g. **2+ \`x** matches
+**"xx"**, **"xxx"**, **"xxxx"**, etc.)
+
+**\*** *pat*
+: Some *pat*s (zero or more, e.g. **\* \`x** matches **""**, **"x"**,
+**"xx"**, etc.)
+
+**+** *pat*
+: At least one *pat*s (e.g. **\+ \`x** matches **"x"**, **"xx"**,
+**"xxx"**, etc.)
+
+*repeating-pat* **%** *sep*
+: *repeating-pat* separated by *sep* (e.g. **\*word % \`,** matches
+zero or more comma-separated words)
+
+**..** *pat*
+: Any text (except newlines) up to and including *pat*
+
+**.. %** *skip* *pat*
+: Any text (except newlines) up to and including *pat*, skipping over
+instances of *skip* (e.g. **\`\"..\`\" % (\`\\.)**)
+
+**\<** *pat*
+: Just after *pat* (lookbehind)
+
+**\>** *pat*
+: Just before *pat* (lookahead)
+
+**\@** *pat*
+: Capture *pat*
+
+**foo**
+: The named pattern whose name is **"foo"**. Pattern names come from definitions in
+grammar files or from named captures. Pattern names may contain dashes (**-**),
+but not underscores (**\_**), since the underscore is used to match whitespace.
+See the **GRAMMAR FILES** section for more info.
+
+**\@** *name* **=** *pat*
+: Let *name* equal *pat* (named capture). Named captures can be used as
+backreferences like so: **\@foo=word \`( foo \`)** (matches **"asdf(asdf)"** or
+**"baz(baz)"**, but not **"foo(baz)"**)
+
+*pat* **=\> \'***replacement***\'**
+: Replace *pat* with *replacement*. Note: *replacement* should be a
+string, and it may contain references to captured values: **\@0** (the whole of
+*pat*), **\@1** (the first capture in *pat*), **\@***foo* (the capture
+named *foo* in *pat*), etc. For example, **\@word \_ \@rest=(\*word % \_)
+=\> \"\@rest \@1\"**
+
+*pat1* **==** *pat2*
+: Matches *pat1*, if and only if *pat2* also matches the text of
+*pat1*\'s match. (e.g. **word == (\"foo\_\" \*.)** matches words that start
+with **"foo\_"**)
+
+*pat1* **!=** *pat2*
+: Matches *pat1*, if and only if *pat2* does not match the text of
+*pat1*\'s match. (e.g. **word == (\"foo\_\" \*.)** matches words that do
+not start with **"foo\_"**)
+
+*name***:** *pat*
+: Define *name* to mean *pat* (pattern definition)
+
+**\#** *comment*
+: A line comment
+
+
+# GRAMMAR FILES
+
+**bp** allows loading extra grammar files, which define patterns which may be
+used for matching. The **builtins** grammar file is loaded by default, and it
+defines a few useful general-purpose patterns. For example, it defines the
+**parens** rule, which matches pairs of matching parentheses, accounting for
+nested inner parentheses:
+
+```
+bp -p '"my_func" parens'
+```
+
+**bp** also comes with a few grammar files for common programming languages,
+which may be loaded on demand. These grammar files are not comprehensive syntax
+definitions, but only some common patterns. For example, the c++ grammar file
+contains definitions for **//**-style line comments as well as
+**/\*...\*/**-style block comments. Thus, you can find all comments with the
+string "TODO" with the following command:
+
+```
+bp -g c++ -p 'comment==(..%\n "TODO" ..%\n$$)' *.cpp
+```
+
+
+# EXAMPLES
+
+**ls \| bp foo**
+: Find files containing the string \"foo\" (a string pattern)
+
+**ls \| bp \'.c\\\$\' -r \'.h\'**
+: Find files ending with \".c\" and replace the extension with \".h\"
+
+**bp -p \'{foobar} parens\' my_file.py**
+: Find the literal string **\"foobar\"**, assuming it\'s a complete word,
+followed by a pair of matching parentheses in the file *my_file.py*
+
+**bp -g html -p 'html-element==(\"\<a \"..%\\n\$\$)' foo.html**
+: Using the *html* grammar, find all *html-element*s matching the tag *a* in
+the file *foo.html*
author	Bruce Hill <bruce@bruce-hill.com>	2021-05-19 22:02:45 -0700
committer	Bruce Hill <bruce@bruce-hill.com>	2021-05-19 22:02:45 -0700
commit	f824d3f3e2e3d3f1d441b94e18d7991ff523cef8 (patch)
tree	99676777dd5550b4a0e6aa849f136769c0b6ff21 /bp.1.md
parent	5d5817c2a3d21e91db4eaaf607ff881d71b57638 (diff)