grex (XML)

A CLI tool that makes XML greppable by flattening it into a line-oriented format — like gron, but for XML.

grex (not to be confused with the regex generator of the same name) transforms an XML document into a flat, line-oriented representation in which each line pairs an XPath expression with a value, much as gron does for JSON.

The flattened format is fully reversible: you can flatten a document, edit the result with standard UNIX tools, and then reconstruct valid XML with grex --ungrex.

Features

  • Flattens XML to line-oriented output — one XPath → value pair per line, easy to pipe into grep, sed, awk, diff, and friends
  • Reversible: grex --ungrex reconstructs XML from the flattened format
  • Diffing — compare two XML documents meaningfully using standard diff
  • Merging — concatenate two flattened documents and ungrex to produce a merged XML file
  • Works with HTML too — handles real-world HTML documents, not just well-formed XML
  • Pairs with xpath-cli — the XPath paths grex emits can be used directly as inputs to xpath
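
The merging workflow can be sketched at the level of the flattened text. This is a minimal illustration, not output captured from grex: the two fragments below are hypothetical, written by hand in the format shown under Usage, and how grex treats a path that appears in both inputs is not covered here.

```shell
# Hypothetical flattened fragments describing different parts of one
# document (normally produced by running grex on two XML files).
a_flat="/pizzeria/@name = Panucci's Pizza
/pizzeria/menu/pizza[1]/@name = Cheese"
b_flat="/pizzeria/menu/pizza[2]/@name = Pepperoni"

# At the flattened level, merging is plain concatenation.
merged=$(printf '%s\n%s\n' "$a_flat" "$b_flat")
printf '%s\n' "$merged"

# With grex installed, the last step would rebuild the XML:
#   printf '%s\n' "$merged" | grex --ungrex > merged.xml
```

Because the paths carry their own position in the tree, the order of the two inputs only matters where they describe the same path.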

Installation

# Install with cargo (requires a Rust toolchain)
cargo install --git https://github.com/jake-low/grex

# Debian / Fedora: install via cargo as above, or build from source at https://github.com/jake-low/grex

Usage

Given a file menu.xml:

<pizzeria name="Panucci's Pizza">
  <menu>
    <pizza name="Cheese" price="14.99"/>
    <pizza name="Pepperoni" price="15.99"/>
  </menu>
</pizzeria>

Flatten it:

grex menu.xml

Output:

/pizzeria/@name = Panucci's Pizza
/pizzeria/menu/pizza[1]/@name = Cheese
/pizzeria/menu/pizza[1]/@price = 14.99
/pizzeria/menu/pizza[2]/@name = Pepperoni
/pizzeria/menu/pizza[2]/@price = 15.99
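
Each line is a path and a value joined by " = ", so it can be split in plain shell. A minimal sketch, assuming the first " = " on a line is always the separator (true for the sample above); flat_path and flat_value are illustrative names and not part of grex.

```shell
# Illustrative helpers, not part of grex: split one flattened line
# into its path and its value at the first " = ".
flat_path()  { printf '%s\n' "${1%% = *}"; }   # drop the longest " = ..." suffix
flat_value() { printf '%s\n' "${1#* = }"; }    # drop the shortest "... = " prefix

line="/pizzeria/@name = Panucci's Pizza"
flat_path "$line"    # prints: /pizzeria/@name
flat_value "$line"   # prints: Panucci's Pizza
```

Splitting at the first separator keeps a value intact even if the value itself contains " = ".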

Now you can use standard UNIX tools on it:

# Find all prices
grex menu.xml | grep '@price'

# Raise all prices by 10% with awk
grex menu.xml \
  | awk -F' = ' '/@price/ {printf "%s = %.2f\n", $1, $2 * 1.10; next} {print}' \
  | grex --ungrex

# Diff two XML documents in a readable way (needs a shell with process substitution, e.g. bash or zsh)
diff <(grex old.xml) <(grex new.xml)

# Read from stdin
curl -s https://example.com/feed.xml | grex
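
To see what the price edit and the diff look like together without installing anything, the sketch below runs the same awk program on the sample output shown earlier and diffs the result against the original. The flattened lines are pasted in by hand rather than produced by grex, and temporary files stand in for process substitution so it runs in any POSIX shell.

```shell
# Sample flattened lines copied from the output above, so this runs
# without grex installed.
old=$(mktemp)
new=$(mktemp)
cat > "$old" <<'EOF'
/pizzeria/@name = Panucci's Pizza
/pizzeria/menu/pizza[1]/@name = Cheese
/pizzeria/menu/pizza[1]/@price = 14.99
/pizzeria/menu/pizza[2]/@name = Pepperoni
/pizzeria/menu/pizza[2]/@price = 15.99
EOF

# Same awk program as above: rewrite @price lines, pass the rest through.
awk -F' = ' '/@price/ {printf "%s = %.2f\n", $1, $2 * 1.10; next} {print}' \
  "$old" > "$new"

# diff exits non-zero when the files differ, hence the "|| true".
changes=$(diff "$old" "$new" || true)
printf '%s\n' "$changes"
rm -f "$old" "$new"
```

Only the two @price lines appear in the diff, with the new values 16.49 and 17.59. Every other line passes through unchanged.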

Reversing the transformation

# Flatten, modify, reconstruct
grex input.xml \
  | sed 's|^/pizzeria/@name = .*|/pizzeria/@name = New Name|' \
  | grex --ungrex > output.xml

# Alias for convenience
alias ungrex="grex --ungrex"

See also

  • xpath-cli — evaluates XPath expressions directly on XML or HTML; the path expressions that grex emits are valid inputs to xpath
  • grex (regex generator) — a completely separate tool by a different author that generates regular expressions from example strings