xled: an introduction

The ved tutorial left you driving a line editor: a buffer of lines, an address, a command, a deliberate write. xled keeps every one of those ideas and adds a second dimension. ved edits lines; xled edits the cells of a table. Where ved is the verbose ed, xled is sed and awk with a live buffer and a coordinate system that matches how you already read a spreadsheet.

This page assumes you are fluent in sed and awk — or that you have just come from ved. It does not re-teach regular expressions, substitution, or what an address is. It teaches what is new in xled: the second axis, the algebra that composes the two axes into one address, and the small compute layer that rides on top. Every example below runs against a real file and shows real output.

From lines to cells

xled runs three ways, all sed-shaped. Give it a script and a file and it prints the result to stdout, ready to pipe. Pipe data in and it reads stdin. Give it a file and no script and it opens an interactive prompt — the ved heartbeat, now over a table.

$ xled '<script>' file.csv     # one-shot: run the script, print to stdout
… | xled '<script>'          # one-shot over piped stdin
$ xled file.csv                # interactive session on a file

A statement is address command, one per line — the same shape as ved. The difference is the buffer. ved's buffer is one-dimensional: lines, numbered top to bottom. xled's buffer is a grid of string cells, with rows numbered like ved's lines and columns lettered like a spreadsheet. So every address now has a row part, a column part, or both.

An address with no command prints those cells (ved's bare-address-prints rule), and a command with no address acts on the whole table. Here is the smallest useful thing — show one column of a products file:

$ xled '[price] show' products.csv
price
19.99
9.50
29.99
4.25
54.00
12.75
18.00
7.99

The cells are all strings. xled never guesses a type, never strips a leading zero, never rounds. That discipline is the spine of everything below.
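To make the all-strings rule concrete, here is a minimal Python sketch of the idea. It is not xled's implementation: the sample data and the `read_cells` helper are hypothetical. It only shows why a grid of strings keeps a leading zero that an eager numeric guess would lose.

```python
import csv
import io

# Hypothetical sample data; the sku values are made up for illustration.
SAMPLE = "name,sku,price\nWidget,00042,9.50\nGadget,00107,12.00\n"


def read_cells(text):
    """Parse CSV text into a grid (a list of rows) of string cells."""
    return [row for row in csv.reader(io.StringIO(text))]


grid = read_cells(SAMPLE)
sku = grid[1][1]

print(repr(sku))            # the cell exactly as it appeared in the file
print(repr(str(int(sku))))  # what a tool that guessed "number" would write back
```

The second print is the failure mode: once the cell has been through an integer, the original text cannot be recovered.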

The second axis: columns

A column is addressed two ways: by its spreadsheet letter, or by its header name in brackets. The two are interchangeable — C and [price] point at the same column in the file above.

Address	Selects
`C`	The column at letter C — past Z too: `AA`, `BC`, `CQ`
`[price]`	The column named `price` — exact, case-sensitive
`3`	Row 3 (a whole row, all columns)
`2:4`	Rows 2 through 4
`B2:C3`	The rectangle from B2 to C3
`[city / region (code)]`	A name with spaces, slashes, or parens — brackets quote it whole

The bracket is the one piece of syntax xled adds, and it earns its place. Real headers carry the characters that break a bare-token address: a space, a hyphen, parentheses, and — fatally — the / that is sed's own substitute delimiter. A bare name is unparseable in general; brackets resolve it, and they disambiguate the rest for free. The column at letter B is B; the column named B is [B]. Row 2024 is 2024; the header 2024 is [2024].

Names match exactly — [userId] is not [userid] — because a header is data, and silently folding its case is the same class of surprise as silently dropping a leading zero. When you want a loose match, you ask for one explicitly (the i flag on a regex, below).
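The two column forms can be modeled in a few lines of Python. This is a sketch under assumptions, not xled's parser: `letter_to_index` and `resolve_column` are hypothetical helper names, and the header list is the one from the products file above.

```python
def letter_to_index(letters):
    """Convert a spreadsheet column letter (A, Z, AA, ...) to a 1-based index."""
    if not letters or not all("A" <= ch <= "Z" for ch in letters):
        raise ValueError(f"not a column letter: {letters!r}")
    index = 0
    for ch in letters:
        # Bijective base 26: A=1 ... Z=26, then AA=27.
        index = index * 26 + (ord(ch) - ord("A") + 1)
    return index


def resolve_column(address, headers):
    """Resolve a letter such as C, or a bracketed name such as [price]."""
    if address.startswith("[") and address.endswith("]"):
        # Exact, case-sensitive match; list.index raises ValueError if absent.
        return headers.index(address[1:-1]) + 1
    return letter_to_index(address)


headers = ["name", "category", "price", "sku"]
print(resolve_column("C", headers), resolve_column("[price]", headers))
print(letter_to_index("AA"), letter_to_index("BC"), letter_to_index("CQ"))
```

Because the bracketed form is looked up by name and the bare form by position, a header literally named `B` and the column at letter B never collide in this model.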

Composing an address

Row and column parts combine through the four operators below, and the first three are not new syntax. They are Excel's own reference operators, including the one most people never learn they are using.

Operator	Means	Example
`:`	Range	`2:4`, `[a]:[d]`
(space)	Intersection	`[category] 2:4`
`,`	Union	`[name],[price]`
`!`	Negation	`!3` (every row but 3)
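One way to reason about the operators in the table above is as set operations over cell coordinates. The Python sketch below is a mental model only, assuming a hypothetical 5 × 4 table; the helper names `rows`, `cols`, and `negate` are invented for illustration and say nothing about how xled evaluates an address.

```python
from itertools import product

N_ROWS, N_COLS = 5, 4  # hypothetical table size

# Every cell in the table as a (row, column) pair, both 1-based.
ALL = set(product(range(1, N_ROWS + 1), range(1, N_COLS + 1)))


def rows(first, last=None):
    """Cells in rows first..last (a range, like 2:4)."""
    last = first if last is None else last
    return {(r, c) for (r, c) in ALL if first <= r <= last}


def cols(first, last=None):
    """Cells in columns first..last (a range, like B:C)."""
    last = first if last is None else last
    return {(r, c) for (r, c) in ALL if first <= c <= last}


def negate(cells):
    """Everything except the given cells (like !3)."""
    return ALL - cells


intersection = cols(2) & rows(2, 4)  # column B, rows 2 through 4
union = cols(1) | cols(3)            # columns A and C
all_but_row_3 = negate(rows(3))

print(sorted(intersection))
```

Read this way, an address is just an expression that evaluates to a set of cells, and the command then runs over that set.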

The space is Excel's intersection operator. A column name beside a row range means "the cells in both", which is exactly what you want when you scope an edit to one column over some rows. Select the name column, intersected with the rows where category matches tools:

$ xled '[category]~/tools/ [name] show' products.csv
name
Widget Pro
Pro Hammer
Bench Vise

That [col]~/re/ form is awk's: a regex scoped to one column, selecting the rows where it matches. A bare /re/ matches against any cell in the row. And a single comparison is itself a row selector — no command word required beyond what you do with the rows:

$ xled 'num([price]) > 20 show' products.csv
name,category,price,sku
Pro Hammer,tools,29.99,TL-0099
Bench Vise,tools,54.00,TL-0153

What you cannot do is glue two conditions together with and / or. That wall is deliberate, and the error tells you where the capability lives:

$ xled 'num([qty])<num([reorder]) and [supplier]~/Contoso/ show' inventory.tsv
combining conditions with and/or is not in xled's scope: an address selects
rows to edit, it is not a query. For one more condition, run a second xled
command on the result; for a real predicate, query first — xql 'SELECT *
WHERE …' file.csv | xled '…'.

An address picks cells to edit; it is not a WHERE clause. Multi-predicate filtering is a query, and a query is xql's job. xled refuses cleanly rather than growing a second, worse query language.

Substitution, scoped to cells

This is the part you already own. s/pattern/replacement/flags is sed's, with the same flags (g, i, an occurrence number) and the same replacement dialect: \1–\9 for captures, & for the whole match, and the case-folding escapes \U \L \u \l \E. The only change is the domain: instead of running over lines, it runs over the cells your address selected, one cell at a time.

The most common real task in a spreadsheet export — strip the currency formatting glued to a money column — is one substitution scoped to one column:

$ xled '[annual_cost] s/[$,]//g' app-portfolio.csv
app_name,owner,annual_cost,status,last_reviewed,criticality
SAP ERP,Finance Dept,1250000,Active,2024-03-15,High
Legacy CRM,j.smith@corp.com,89500,active,03/15/2024,high
TimeTracker,IT Operations ,12000,IN USE,2024-01-02,Medium

Every other cell round-trips byte-for-byte: quoted fields, embedded commas, and leading zeros are untouched because the CSV layer parses them properly and only the addressed cells are rewritten. Case-folding works as in sed: uppercase a column by matching the whole cell and folding the match:

$ xled '[criticality] s/.*/\U&/' app-portfolio.csv
… ,Active,2024-03-15,HIGH
… ,active,03/15/2024,HIGH
… ,IN USE,2024-01-02,MEDIUM

Captures rearrange exactly as you would expect — turn First Last into Last, First, and note that xled re-quotes the comma it just introduced:

$ xled '[name] s/(\S+) (\S+)/\2, \1/' contacts.csv
name,email,phone,company
"Nguyen, Alice",Alice.Nguyen@EXAMPLE.com,(555) 123-4567,Acme Corp
"O'Brien, Bob",bob@widgets.io,555.234.5678,Widgets Inc
"DIAZ, CAROL",carol_diaz@MAIL.COM,+1 555 345 6789,

Two differences from line-oriented sed are worth holding in mind. Anchors are cell-bounded: ^ and $ mean the start and end of the cell, not of a line. And a bare /re/ in address position is an any-cell selector across the table, where in sed it would be a line address. Both follow from the buffer being a grid rather than a stream.
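The captures example earlier in this section can be approximated with Python's standard `csv` and `re` modules. Treat it as a rough model rather than xled's engine: the input is a hypothetical two-row file, `substitute_column` is an invented helper, and Python's writer normalizes quoting instead of preserving untouched cells byte-for-byte.

```python
import csv
import io
import re

# Hypothetical input shaped like the contacts example.
SOURCE = (
    "name,email\n"
    "Alice Nguyen,alice@example.com\n"
    "Bob O'Brien,bob@widgets.io\n"
)


def substitute_column(text, column, pattern, replacement):
    """Apply one substitution to each cell of a single named column."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    target = header.index(column)
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(header)
    for row in reader:
        # count=1 mirrors s/// without the g flag: first match only.
        # The pattern sees one cell at a time, so ^ and $ would be cell-bounded.
        row[target] = re.sub(pattern, replacement, row[target], count=1)
        writer.writerow(row)
    return out.getvalue()


result = substitute_column(SOURCE, "name", r"(\S+) (\S+)", r"\2, \1")
print(result)
```

The writer quotes the rewritten names because they now contain the delimiter, which is the same reason the xled output above shows them in quotes.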

The compute layer

Where s/// rewrites the characters of a cell, = expr computes a value — the awk half of xled. It writes into exactly one column, creating it if the name is new. Values are one of three types (string, number, bool), and there is no automatic coercion. Arithmetic requires numbers; you cast with num(). That single rule is why 00042 stays 00042 until you decide otherwise.

$ xled '[total] = round(num([price]) * 1.0825, 2)' products.csv
name,category,price,sku,total
Widget Pro,tools,19.99,TL-0042,21.64
Gadget,gizmos,9.50,GZ-0101,10.28
Pro Hammer,tools,29.99,TL-0099,32.46

The sku column, an identifier that looks numeric in places, is never touched, because nothing asked it to be a number. And the currency is rounded only because round(…, 2) said so. Numbers serialize at full f64 precision, so any money column must be wrapped in round; xled will not round on write, because choosing a precision you did not ask for is the same betrayal as silent coercion. A cast that fails is non-halting: the cell is left as it was, and a tally reports how many were skipped.
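The explicit-cast rule and the non-halting failure can be sketched in Python as follows. The names `num` and `add_total` are borrowed or invented for illustration, the `n/a` cell is hypothetical, and Python's `float` accepts a few inputs (surrounding whitespace, `nan`) that xled's cast may well reject, so read this as the shape of the behavior and nothing more.

```python
def num(cell):
    """Explicit cast: return a float, or None when the cell is not numeric."""
    try:
        return float(cell)
    except ValueError:
        return None


def add_total(prices, rate=1.0825, digits=2):
    """Compute a rounded total per cell; skip and count cells that fail the cast."""
    totals, skipped = [], 0
    for cell in prices:
        value = num(cell)
        if value is None:
            totals.append("")  # assumption: a failed cast leaves the new cell blank
            skipped += 1
        else:
            totals.append(str(round(value * rate, digits)))
    return totals, skipped


totals, skipped = add_total(["19.99", "9.50", "29.99", "n/a"])
print(totals, skipped)
print(19.99 * 1.0825)  # unrounded: the full floating-point value
```

The last line is why the `round` call matters: without it, whatever digits the floating-point product carries are what would be written out.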

The function library is small and derived from real work, not invented:

Function	Does
`num(x)`, `bool(x)`	Explicit cast; a failure leaves the cell and tallies
`len(x)`	Character length
`left(x,n)`, `right(x,n)`, `mid(x,s,n)`	Substrings (Excel spelling)
`substr(x,s[,n])`	awk substring; the 2-arg form runs to the end
`round(x,d)`	Round to `d` decimals
`default(x,fb)`, `coalesce(a,b,…)`	Fill blanks
`if(cond,a,b)`	A conditional expression, not control flow

One awk habit to unlearn: comparisons are string-wise unless you cast. [qty] < [reorder] compares the literal strings, so "9" > "10" lexically — which is not numeric order. Cast both sides for numbers. The payoff is that xled never silently numifies the way awk does, which is exactly the surprise the stringly model exists to prevent.

$ xled '[low] = num([qty]) < num([reorder])' inventory.tsv
sku	location	qty	reorder	supplier	low
TL-0042	A1-03	120	50	Northwind	false
GZ-0101	B2-11	8	25	Contoso	true
EL-0007	A3-07	0	10	Fabrikam	true
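The string-wise comparison rule described above is easy to check in any language that compares strings lexically. This short Python sketch uses hypothetical cell values and shows the two answers side by side.

```python
qty, reorder = "9", "10"

# Compared as strings, "9" sorts after "10" because "9" > "1" at the first character.
string_wise = qty < reorder

# Cast both sides and the comparison becomes numeric.
numeric = float(qty) < float(reorder)

print(string_wise, numeric)
print(sorted(["10", "9", "100"]))             # lexical order
print(sorted(["10", "9", "100"], key=float))  # numeric order
```

The lexical result is not wrong, it is just answering a different question, which is why the cast has to be spelled out on both sides.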

And because a comparison is also a row selector, the same predicate can scope an edit instead of producing a column — flag the low-stock rows in place:

$ xled 'num([qty]) < num([reorder]) [supplier] = "REORDER"' inventory.tsv
TL-0042	A1-03	120	50	Northwind
GZ-0101	B2-11	8	25	REORDER
EL-0007	A3-07	0	10	REORDER
TL-0099	C1-02	45	30	Northwind

Case-folding and trimming are not in the function library on purpose — they are pattern rewrites of text, so they live in s///. The line between the two layers stays sharp: s/// rewrites the characters of a cell; = expr produces a value.

Taming a real mess

Here is what sed and awk never gave you. Spreadsheets arrive structurally damaged: a header buried under a title block, blank spacer rows and columns, merged cells that left holes, two tables stacked in one sheet. xled meets these with a handful of addressing verbs — no detection magic, no reshaping.

Verb	Does
`crop`	Reduce the buffer to one rectangle — carve a table out of junk
`N header`	Promote row N to the column-name header
`rename newname`	Rename a header in place (rest of line, no quoting)
`fill` / `fill down`	Fill blank cells from the value above (merged-cell artifacts)
`drop blanks [rows|cols]`	Trim empty edge rows and columns
`describe`	Advisory region report — never mutates

Start with describe. It reads the shape and tells you what it suspects, and it never changes anything. Point it at a risk log whose real header sits on row 5 under a three-line title block:

$ xled --no-header 'describe' risk-log.csv
10 rows × 5 cols. leading blank rows: 0. trailing blank rows: 2.
suspected header row: 5 (narrower preamble above — try `5 header`).
(advisory — turn this into crop/header/del yourself.)

The --no-header flag is the key move for a buried header: it tells xled not to adopt row 1 as the header, so row numbers line up with the file and you can address the real header row directly. Then it is two verbs — crop to the table, promote its header:

$ xled --no-header '5:8 crop
1 header
show' risk-log.csv
ID,Application,Risk,Likelihood,Impact
R1,Acme CRM,Vendor end-of-life,High,High
R2,Globex Billing,Data migration gap,Medium,High
R3,Initech HR,License overage,Low,Medium

Crop carves a rectangle — one working table. A file with two stacked tables is two passes, not one clever command, because xled is not a splitter. That restraint is the same one that keeps it from reshaping: it will trim blank edges and promote a header, but it will not unpivot, split a cell into columns, or merge stacked tables. Those losses happen upstream or belong to a different tool, and xled says so plainly rather than guessing.
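The crop-then-promote sequence is simple enough to model on a list of rows. In the Python sketch below, rows 5 through 8 mirror the risk-log output shown above, while the title block in rows 1 through 3 is invented. The `suspect_header` heuristic is a guess at what an advisory check could look like, not a description of how describe actually decides.

```python
# Hypothetical raw sheet: 10 rows x 5 columns, header buried on row 5.
RAW = [
    ["Risk Log", "", "", "", ""],
    ["Prepared for review", "", "", "", ""],
    ["Draft", "", "", "", ""],
    ["", "", "", "", ""],
    ["ID", "Application", "Risk", "Likelihood", "Impact"],
    ["R1", "Acme CRM", "Vendor end-of-life", "High", "High"],
    ["R2", "Globex Billing", "Data migration gap", "Medium", "High"],
    ["R3", "Initech HR", "License overage", "Low", "Medium"],
    ["", "", "", "", ""],
    ["", "", "", "", ""],
]


def suspect_header(grid):
    """Advisory guess: the first row in which every cell is non-blank."""
    for number, row in enumerate(grid, start=1):
        if all(cell.strip() for cell in row):
            return number
    return None


def crop_rows(grid, first, last):
    """Keep rows first..last (1-based, inclusive)."""
    return grid[first - 1:last]


def promote_header(grid, n):
    """Split the grid into (header, body), using row n as the header."""
    return grid[n - 1], grid[n:]


table = crop_rows(RAW, 5, 8)
header, body = promote_header(table, 1)
print(suspect_header(RAW), header, len(body))
```

Note that the guess only reports a row number. Cropping and promoting remain separate, explicit steps, which matches the advisory-only role the text gives describe.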

Preview, undo, deliberate save

Open a file with no script and you get the ed-lineage session, made interactive. It can preview any edit before committing, keeps an undo stack, and writes only when you say so: the same deliberate-save instinct ved drilled into you, which matters more when an edit can touch a whole column at once.

$ xled app-portfolio.csv
xled> preview [annual_cost] s/[$,]//g     # show the effect, commit nothing
xled> [annual_cost] s/[$,]//g             # run it for real
xled> undo                                # take it back
reverted last change
xled> write                               # save to the source file
wrote 7 rows to app-portfolio.csv
xled> quit

The commands are plain words — preview, undo, write, help, quit — not single-letter escapes. A one-shot script and the same lines typed into the session produce identical results; the interactive loop is a way to see each step, not a different engine. In a pipe, the data goes to stdout and the advisory notices go to stderr, so xled … file.csv > out.csv stays clean.
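The preview, undo, and write behavior can be pictured as a small state machine over the buffer. This Python sketch is an assumption-laden model: the `Session` class, the `strip_currency` edit, and the cell values are all hypothetical, and the `saved` attribute merely stands in for the file on disk.

```python
import copy


class Session:
    """A toy editing session: working grid, undo stack, deliberate save."""

    def __init__(self, grid):
        self.grid = grid
        self.saved = copy.deepcopy(grid)  # stands in for the file on disk
        self._undo = []

    def preview(self, edit):
        """Return what the edit would produce, committing nothing."""
        return edit(copy.deepcopy(self.grid))

    def apply(self, edit):
        """Run the edit for real, remembering the previous state."""
        self._undo.append(copy.deepcopy(self.grid))
        self.grid = edit(copy.deepcopy(self.grid))

    def undo(self):
        """Revert the most recent change, if there is one."""
        if self._undo:
            self.grid = self._undo.pop()

    def write(self):
        """Save the working grid; nothing reaches 'disk' before this."""
        self.saved = copy.deepcopy(self.grid)


def strip_currency(grid):
    """Remove $ and , from the first column of every row."""
    for row in grid:
        row[0] = row[0].replace("$", "").replace(",", "")
    return grid


session = Session([["$1,250,000"], ["$89,500"]])
print(session.preview(strip_currency), session.grid)
```

The preview and the real edit call the same function on the same data, which is the model's version of the one-engine point: the session adds visibility and an undo stack, not different semantics.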

Where xled stops

A sed/awk reflex will reach for sort, for a group-by, for a join. xled refuses all of them, and the refusal is a feature: each one names the tool that has it. Sorting, grouping, aggregating, and joining are SQL operations, and they go to xql or DuckDB. Splitting one cell into several columns, collapsing a multi-row header, unpivoting: those reshape the table, and xled never changes the table's shape beyond cropping to a rectangle, trimming blank edges, and appending a computed column.

What remains is a sharp tool that does one thing completely: address part of a table and rewrite it, faithfully, with the result in front of you before you commit. The pieces compose down a pipe the way Unix intends:

duckdb -csv … | xled '[total] = round(num([price]) * num([qty]), 2)' | xql 'SELECT …'

If you know sed and awk, you now know xled. The second axis and the compute layer are the only genuinely new ideas; everything else is muscle memory you already had.

Ready to try it?

xled is a single binary for Linux, macOS, and Windows, MIT-licensed and written in pure Rust. The xled page has the install steps, and the full source, grammar reference, and design notes live at github.com/excelano/xled.
