Ben Nadel at CFCamp 2023 (Freising, Germany) with: Michael Wiesmeyr and Cynthia Ramos and Raphael Mürwald and Marcel Mayr and Joseph Pernerstorfer and Daniel Pötscher and Andreas Breit and Luccas Hansch and Roman Schick and Christian Immitzer

CSV To CTE Transformer In Angular 18


When generating reports at work, I make heavy use of common table expressions (CTE) as well as the VALUES / ROW construct to create derived tables in MySQL 8. Thanks to the awesome power of Sublime Text's multi-cursor functionality, going from CSV (comma-separated values) to CTE only takes a minute or two; but, it's a repetitive task that I'm keen to create a utility for (see my blog's utilities section). As such, I wanted to create a rough draft of this utility using Angular 18.

Run this demo in my JavaScript Demos project on GitHub.

View this code in my JavaScript Demos project on GitHub.

To get a high-level sense of what I'm talking about here, I want to take a CSV text payload like this:

ID,EMAIL
1,ben@example.com
2,laura@example.com
3,smith@example.com

And generate a MySQL query that looks like this:

WITH
	derived (
		id,
		email
	) AS (
		VALUES
			ROW ( 1, 'ben@example.com' ),
			ROW ( 2, 'laura@example.com' ),
			ROW ( 3, 'smith@example.com' )
	)
SELECT
	d.id,
	d.email
FROM
	derived d
;

I can then take this base SQL statement and manually add additional INNER JOIN clauses for on-the-fly reporting.
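
For example, here's a rough sketch of what that kind of ad hoc join might look like (the account table and its columns are hypothetical, just to illustrate the shape of the query):

WITH
	derived (
		id,
		email
	) AS (
		VALUES
			ROW ( 1, 'ben@example.com' ),
			ROW ( 2, 'laura@example.com' ),
			ROW ( 3, 'smith@example.com' )
	)
SELECT
	d.id,
	d.email,
	a.name
FROM
	derived d
INNER JOIN
	account a
ON
	a.email = d.email
;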

When converting a CSV value to a CTE like the one above, there are three steps involved:

  1. Parse the CSV text into a two-dimensional rows array.

  2. Map the CSV fields onto SQL query columns.

  3. Render the common table expression using the desired mappings.

To get this done, the first thing I did was create a simple CsvParser class (see code at end of post). It's not the most flexible CSV parser; but, it should get the job done for the type of data that I'm usually working with (i.e., clean, well-formatted data being generated by spreadsheet exports).
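
To give a rough sense of the parser's contract (a sketch, not code from the demo - I'm instantiating the class directly just for illustration; in the app it gets injected), parsing a small payload yields a two-dimensional array of strings:

// Import app modules.
import { CsvParser } from "./csv-parser";

var parser = new CsvParser();
var rows = parser.parse( 'ID,EMAIL\n1,"ben@example.com"' );

// Logs: [ [ "ID", "EMAIL" ], [ "1", "ben@example.com" ] ]
console.log( rows );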

Once I had my CSV parser, I was able to build the primary CSV-to-CTE experience. Since every CSV payload is different, I had to include a way to customize the mappings between the two constructs. For this, I'm using Angular's powerful template-driven forms (the FormsModule). Template-driven forms keep both the logic and the markup simple while also providing for completely dynamic form fields.

In this case, I have to generate several form fields for each field within the CSV payload. Each CSV field needs both a custom name and a flag indicating whether or not to include it in the generated SQL query. I'm representing this data as an array of FieldSetting objects:

type FieldSettings = FieldSetting[];

interface FieldSetting {
	fieldIndex: number;
	columnName: string;
	exampleValue: string;
	includeInCte: boolean;
}
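
For the sample CSV above (with the first row being treated as column headers), the populated settings come out roughly like this (a sketch, not code from the demo):

var fieldSettings: FieldSettings = [
	{
		fieldIndex: 0,
		columnName: "ID",
		exampleValue: "1",
		includeInCte: true
	},
	{
		fieldIndex: 1,
		columnName: "EMAIL",
		exampleValue: "ben@example.com",
		includeInCte: true
	}
];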

The magic of Angular's NgModel directive is that it can bind to object properties (such as those in the FieldSetting instance). As such, I can use the @for control flow to output a dynamic set of form inputs and then bind them to the underlying view-model. It's almost too easy!

Here's the user interface for my transformer. It's the CSV textarea (input) followed by the dynamic list of field mappings followed by the CTE textarea (output).

<p>
	This takes a CSV (comma-separated values) payload and transforms it into a custom CTE
	(common table expression) that can be used in MySQL 8.
</p>

<p class="flex-row">
	<button
		(click)="applySampleTemplate( sampleTemplate1 )"
		class="sample">
		Try with sample data
	</button>
	<button
		(click)="applySampleTemplate( sampleTemplate2 )"
		class="sample">
		Try less data
	</button>
</p>

<form>

	<h2>
		Step 1: Paste CSV Text
	</h2>

	<p class="csv">
		<textarea
			name="csvInput"
			[(ngModel)]="csvInput"
			(ngModelChange)="processInput()"
			class="csv__input"
		></textarea>

		<label class="csv__header flex-row">
			<input
				type="checkbox"
				name="ignoreFirstRow"
				[(ngModel)]="ignoreFirstRow"
				(ngModelChange)="processInput()"
			/>
			Ignore first row (column headers).
		</label>
	</p>

	<h2>
		Step 2: Configure Field-to-Column Mappings
	</h2>

	@for ( fieldSetting of fieldSettings ; track $index ) {

		<p class="field">
			<span class="flex-row">
				<input
					type="text"
					name="fieldName{{ $index }}"
					[(ngModel)]="fieldSetting.columnName"
					(ngModelChange)="renderCte()"
					class="field__name"
				/>
				<label class="flex-row">
					<input
						type="checkbox"
						name="includeField{{ $index }}"
						[(ngModel)]="fieldSetting.includeInCte"
						(ngModelChange)="renderCte()"
					/>
					include in SQL.
				</label>
			</span>
			<span class="field__example">
				Example: {{ fieldSetting.exampleValue }}
			</span>
		</p>

	}

	<h2>
		Step 3: Copy SQL Statement For Fun And Profit
	</h2>

	<p class="cte">
		<textarea
			name="cteOutput"
			[(ngModel)]="cteOutput"
			class="cte__output"
		></textarea>
	</p>

</form>


<!-- Sample CSV data that can be loaded into the demo. -->
<template #sampleTemplate1>
	... truncated for demo ...
</template>
<template #sampleTemplate2>
	... truncated for demo ...
</template>

Just look at how simple NgModel keeps the markup. Freaking Angular, AMIRIGHT?!

At the end of that HTML I have a few <template> elements that provide some sample data. This helped me develop the app; but, it should also make the demo a bit more exciting.

Here's the code-behind for this experience:

// Import vendor modules.
import { Component } from "@angular/core";
import { FormsModule } from "@angular/forms";
import { inject } from "@angular/core";

// Import app modules.
import { CsvParser } from "./csv-parser";
import { CsvRow } from "./csv-parser";
import { CsvRows } from "./csv-parser";

// ----------------------------------------------------------------------------------- //
// ----------------------------------------------------------------------------------- //

type FieldSettings = FieldSetting[];

interface FieldSetting {
	fieldIndex: number;
	columnName: string;
	exampleValue: string;
	includeInCte: boolean;
}

// ----------------------------------------------------------------------------------- //
// ----------------------------------------------------------------------------------- //

@Component({
	selector: "app-root",
	standalone: true,
	imports: [
		FormsModule
	],
	styleUrl: "./app.component.less",
	templateUrl: "./app.component.html"
})
export class AppComponent {

	private csvParser = inject( CsvParser );

	public csvInput = "";
	public csvRows: CsvRows = [];
	public ignoreFirstRow = false;
	public fieldSettings: FieldSettings = [];
	public cteOutput = "";

	// ---
	// PUBLIC METHODS.
	// ---

	/**
	* I apply and process the given sample data template reference.
	*/
	public applySampleTemplate( sampleTemplate: HTMLTemplateElement ) {

		this.csvInput = sampleTemplate.content.textContent!.trim();
		this.ignoreFirstRow = true;
		this.processInput();

	}


	/**
	* I process the CSV input and generate the CTE output.
	*/
	public processInput() {

		this.csvRows = this.csvParser.parse( this.csvInput );
		this.fieldSettings = [];
		this.cteOutput = "";

		// If we have no rows, there's nothing else to process.
		if ( ! this.csvRows.length ) {
			
			return;

		}

		// If we only have one row and we want to ignore it, there's nothing else to
		// process.
		if ( this.ignoreFirstRow && ( this.csvRows.length === 1 ) ) {

			return;

		}

		// If we ignore the first row, it means that the first row is the field headers.
		// We can therefore use the result of the splice() operation to default the names
		// of the fields in the SQL statement.
		var splicedRow: CsvRow = [];

		if ( this.ignoreFirstRow ) {

			splicedRow = this.csvRows.splice( 0, 1 )
				.at( 0 )!
				.map( ( value ) => value.replace( /\s+/g, "" ) )
			;

		}

		// We're going to assume that the row structures are uniform and use the first row
		// as a representation of the overall row structure. This will be used to define
		// the default field names (giving precedence to the spliced row above).
		this.fieldSettings = this.csvRows[ 0 ].map(
			( fieldValue, i ) => {

				return {
					fieldIndex: i,
					columnName: ( splicedRow[ i ] || `f${ i + 1 }` ),
					exampleValue: fieldValue,
					includeInCte: true
				};

			}
		);

		this.renderCte();

	}


	/**
	* I render the CTE output based on the current view-model.
	*/
	public renderCte() {

		// Filter down to the fields we want to include in the CTE rendering.
		var settings = this.fieldSettings.filter(
			( fieldSetting ) => {

				return fieldSetting.includeInCte;

			}
		);

		// If none of the fields are to be included, there's nothing left to process.
		if ( ! settings.length ) {

			this.cteOutput = "";
			return;

		}

		this.cteOutput = `
			WITH
				derived (
					${ derivedColumns( settings ) }
				) AS (
					VALUES
						${ derivedRows( settings, this.csvRows ) }
				)
			SELECT
				${ selectColumns( settings ) }
			FROM
				derived d
			;
		`;

		// Our string template approach added a bunch of extra indentation on each line.
		// Let's remove that indentation to create a properly formatted SQL statement.
		this.cteOutput = this.cteOutput
			.trim()
			.replace( /^\t{3}/gm, "" )
		;

		// Function-local helper methods.

		function derivedColumns( settings: FieldSettings ) {

			return settings
				.map(
					( setting ) => {

						return `\`${ setting.columnName }\``;

					}
				)
				.join( ",\n					" )
			;

		}

		function derivedRows( settings: FieldSettings, rows: CsvRows ) {

			return rows
				.map(
					( row ) => {

						return `ROW ( ${ derivedRow( settings, row ) } )`;

					}
				)
				.join( ",\n						" )
			;

		}

		function derivedRow( settings: FieldSettings, row: CsvRow ) {

			return settings
				.map(
					( setting ) => {

						return `'${ row[ setting.fieldIndex ].replace( /'/g, "''" ) }'`;

					}
				)
				.join( ", " )
			;

		}

		function selectColumns( settings: FieldSettings ) {

			return settings
				.map(
					( setting ) => {

						return `d.\`${ setting.columnName }\``;

					}
				)
				.join( ",\n				" )
			;

		}

	}

}

If I now run this Angular code and populate the CSV input, I get the following output:

As you can see, the CSV payload has become a CTE SQL query.

As a final note, here's the CSV parser that I created for the demo:

// Import vendor modules.
import { Injectable } from "@angular/core";

// ----------------------------------------------------------------------------------- //
// ----------------------------------------------------------------------------------- //

export type CsvRows = CsvRow[];
export type CsvRow = string[];

// This is the native match that exec() will return.
type ExecMatch = RegExpExecArray | null;
// This is the translated match that we use internally.
type ParserMatch = FieldMatch | null;
// This is the translated match interface.
interface FieldMatch {
	value: string;
	isQuotedValue: boolean;
	isNakedValue: boolean;
	delimiter: string;
	isFieldDelimiter: boolean;
	isRowDelimiter: boolean;
	isEndOfInput: boolean;
}

// ----------------------------------------------------------------------------------- //
// ----------------------------------------------------------------------------------- //

@Injectable({
	providedIn: "root"
})
export class CsvParser {

	/**
	* I parse the given input into a CSV rows result.
	*/
	public parse( input: string ) : CsvRows {

		// Note: This is a judgement call, but stripping off any trailing newlines at the
		// end of the input seems to result in a more meaningful match. Otherwise you can
		// end up with some empty row at the end depending on the algorithm.
		input = input.replace( /[\r\n]+$/, "" );

		if ( ! input ) {

			return [];

		}

		var pattern = this.createPattern();
		var match: ParserMatch;
		var row: CsvRow = [];
		var rows: CsvRows = [row];

		while ( match = this.transformMatch( pattern.exec( input ) ) ) {

			row.push( match.value );

			// Because our pattern can match an empty field followed by the end-of-input,
			// there's a zero-length match at the end of the input sequence that never
			// moves the lastIndex of the exec() function forward. As such, we have to
			// explicitly break out of the loop once we've reached the end.
			if ( match.isEndOfInput ) {

				break;

			}

			if ( match.isRowDelimiter ) {

				row = [];
				rows.push( row );

			}

		}

		return rows;

	}

	// ---
	// PRIVATE METHODS.
	// ---

	/**
	* I create the RegExp instance for matching fields and delimiters.
	*/
	private createPattern() : RegExp {

		// Note: The String.raw tag will return the embedded escape sequences (\r\n) as
		// string literals instead of interpreting them as escape sequences.
		var source = String.raw`
			(?:
				"([^"]*(?:""[^"]*)*)"    # Quoted field value.
				|
				([^",\r\n]*)             # Naked field value.
			)
			(,|\r\n?|\n|$)               # Field delimiter.
		`;

		// The native RegExp doesn't support the "verbose" flag. As such, let's strip out
		// any comments and whitespace in the current source.
		source = source
			.replace( /# [^\r\n]*/g, " " )
			.replace( /\s+/g, "" )
		;

		return new RegExp( source, "gy" );

	}


	/**
	* I transform the native exec() match into a local parser match.
	*/
	private transformMatch( match: ExecMatch ) : ParserMatch {

		if ( ! match ) {

			return null;

		}

		var value = ( match[ 1 ] || match[ 2 ] || "" );
		var isQuotedValue = ( match[ 1 ] !== undefined );
		var isNakedValue = ! isQuotedValue;
		var delimiter = match[ 3 ];
		var isFieldDelimiter = ( delimiter === "," );
		var isRowDelimiter = ! isFieldDelimiter;
		var isEndOfInput = ! delimiter;

		// If the field value is quoted, canonicalize any embedded quotes.
		if ( isQuotedValue ) {

			value = value.replace( /""/g, '"' );

		}

		return {
			value,
			isQuotedValue,
			isNakedValue,
			delimiter,
			isFieldDelimiter,
			isRowDelimiter,
			isEndOfInput
		};

	}

}

While I've been using regular expressions for ages, I don't think I've ever used the .exec() method in a TypeScript setting before. It was good to see that the TypeScript error messages pointed me in the right direction (vis-à-vis typing).
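
If you haven't hit this before, here's a tiny sketch (not from the demo) of the typing in play: .exec() returns RegExpExecArray | null, so the loop condition doubles as the null check and narrows the type inside the loop body:

var pattern = /(\d+)/g;
var input = "a1b22c333";
var match: RegExpExecArray | null;

while ( match = pattern.exec( input ) ) {

	// Inside the loop, TypeScript has narrowed match to RegExpExecArray.
	console.log( match[ 1 ] ); // "1", then "22", then "333"

}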

Want to use code from this post? Check out the license.

Reader Comments

15,827 Comments

@Chris,

Ha ha, the same, only much harder 🤪 The template-based form stuff is something Angular really just nailed right out of the starting gate. I think React is only just now starting to embrace the idea of "uncontrolled form inputs" and two-way data binding (they've always allowed for uncontrolled inputs; but, I think I heard that they are starting to realize that it just makes form interactions so much easier ... but, I could be talking out my behind).

It would be curious to see this in React - I wonder if I can just paste it into Chat GPT.

438 Comments

Great stuff, Ben.

Now I am guessing this:

private csvParser = inject( CsvParser );

Is a new Angular 16+ feature that allows us to inject stuff without using a constructor. This is a really great little feature.

Oooo. Ooo. And I also see you are using a box of bananas 📦 🍌

[(ngModel)]="cteOutput"

Don't you just love the flexibility of bi-directional data flow that React cannot match! ☺️

438 Comments

And a few questions about the MySQL stuff.

What exactly is a derived table?
I don't think I have ever used one of these?
Are they like a temporary table?
I presume the Row() function is similar to a normal table INSERT?
Is this a MySQL 8 feature?

438 Comments

OK. I am beginning to think that this is just a plain old sub query.

For one moment, I thought that derived() was some new funky MySQL 8 method.
I've just done a Google on it and all I can find for derived is sub query.
It's just that the ROW() method threw me a bit, along with the WITH keyword.

438 Comments

Ben.

It would be really great if you could do a mini tutorial based on the MySQL, at the top of this article.
I am trying to get my head around how it all works and what the ROW() method actually does.

I have done some Googling on ROW() and there isn't much out there.

The only thing that I have gleaned so far, is that it compares different sub queries with two or more columns?

15,827 Comments

@Charles,

Yes, the inject( CsvParser ) stuff is Angular's ability to do dependency-injection without having to do constructor injection. Though, I think that's actually what it is doing under the hood during the compilation of the code. You can only use inject() in "constructor contexts" for this reason, I assume.
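
For comparison, here's a quick sketch (not from the demo) of the same dependency being resolved both ways - the field initializer runs in an injection context, which is why the inject() form works:

// Import vendor modules.
import { Component } from "@angular/core";
import { inject } from "@angular/core";

// Import app modules.
import { CsvParser } from "./csv-parser";

@Component({
	selector: "app-inject-fn",
	standalone: true,
	template: ""
})
export class InjectFnExample {

	// inject() called from a field initializer (an injection context).
	private csvParser = inject( CsvParser );

}

@Component({
	selector: "app-ctor",
	standalone: true,
	template: ""
})
export class ConstructorExample {

	// The classic constructor-parameter injection; both components receive
	// the same root-provided CsvParser instance.
	constructor( private csvParser: CsvParser ) {
	}

}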

As far as the SQL stuff, you are correct - the common table expression (CTE) is really just storing a sub-query into a variable. In this case, I'm using the VALUES / ROW combination, but you can just as easily use a SELECT:

WITH derived AS (

	SELECT * FROM other_table
	ORDER BY id DESC
	LIMIT 10

)
SELECT
	*
FROM
	derived

Note that with the SELECT I don't have to define the column names since that's handled by the SELECT itself. In the example in my blog post, since there's no SELECT, I have to be more explicit about the column names.

And yes, there's nothing special about the term derived - that's just the variable name I used. I could have used:

WITH temp AS or WITH users AS - I only use derived because that's what I'm used to using.

As for the ROW stuff, yeah, I think that was introduced in MySQL 8 as well. I took a closer look at that previously if you're curious:

www.bennadel.com/blog/4626-using-values-row-to-create-derived-table-from-static-values-in-mysql-8-0-19.htm

It's basically just a way to map values onto columns in the derived table. So, instead of having to use a bunch of UNION ALL calls to construct a table:

SELECT 1 AS id, 'ben@example.com' AS email UNION ALL
SELECT 2 AS id, 'sarah@example.com' AS email UNION ALL
SELECT 3 AS id, 'john@example.com' AS email

... you can just use ROW() to define each row. Once you start using it, it's quite a nice developer experience.

And YES, the two-way data binding is AMAZING! 🎉

438 Comments

@Ben Nadel,

Thanks for this explanation.

So, just to clarify, I presume the reason you didn't use:

WITH derived AS (
	SELECT * FROM other_table
	ORDER BY id DESC
	LIMIT 10
)
SELECT
	*
FROM
	derived

Is because you weren't building a derived table from an existing table.
In your example, you were building the derived table from scratch?

Now I was reading that the difference between CTE and a sub query is that CTE can deal with recursion.

Do you have any more detail on the recursion part? I am not entirely sure what recursion is referring to, in this context?

15,827 Comments

@Charles,

Exactly right. Since I wasn't using a sub-query, but rather constructing actual rows, something somewhere has to define what the column names are called. And, in this particular demo, the user can customize each field name via the Angular view.

As far as recursion, I looked it up one time but I don't think I've ever tried it out. I think the canonical example is usually something like a file/folder tree where there is a hierarchy across the queries. But, I've never actually put something like that into practice at the DB level (normally I would pull more records back and then perform the hierarchy calculations in the application code).
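
For what it's worth, a hedged sketch of that file/folder idea might look something like this (the folder table and its ( id, name, parentID ) columns are hypothetical):

WITH RECURSIVE folder_tree AS (
	-- Anchor: start with the root folders.
	SELECT
		f.id,
		f.name,
		f.parentID,
		1 AS depth
	FROM
		folder f
	WHERE
		f.parentID IS NULL
	UNION ALL
	-- Recursive step: pull in the children of the rows gathered so far.
	SELECT
		child.id,
		child.name,
		child.parentID,
		( parent.depth + 1 ) AS depth
	FROM
		folder_tree parent
	INNER JOIN
		folder child
	ON
		child.parentID = parent.id
)
SELECT
	*
FROM
	folder_tree
ORDER BY
	depth ASC
;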

438 Comments

@Ben Nadel,

Cheers for the confirmation.

I managed to find this simple example, which has helped me understand the basics:

WITH RECURSIVE x2 (result) AS (
    SELECT 1
    UNION ALL
    SELECT result*2 FROM x2)
SELECT * FROM x2 LIMIT 10;
result 
-------- 
      1 
      2 
      4 
      8 
     16 
     32 
     64 
    128 
    256 
    512 
(10 rows)

15,827 Comments

Yeah, that looks bananas 😜 I don't think I'll ever use something like that in my app DB (feels like something that should be in the app code itself); but, you've inspired me to at least look at it a little more closely.

15,827 Comments

Over on LinkedIn, Gurpreet asked for an example of where I might use this technique. I've been using it a ton lately as we start to send out emails about the shut-down of our system. Let me provide some context:

I have an account representative who needs to communicate with our enterprise clients about when their enterprise contracts are ending. These emails are all being sent out through a specialized workflow which looks at an "email verification" table before sending. Since we have to keep our bounce rate low, we don't want to risk sending emails to people whose email address may no longer be valid.

Aside: See the post, "Lessons Learned From Sending 7 Million Emails In 5 Days Using ColdFusion", for more details on my email adventures.

So, when my account rep comes to me and says she needs to send out an email to a given set of enterprises, she'll give me the list of emails that she wants to target. Then, before we do any sending, I'll compare her list to the email validation cache to see which emails haven't been validated (i.e., don't have a corresponding record in the DB yet) and which will be skipped due to their current validation status.

The query for this looks something like this:

-- Setup my derived table with the emails
-- provided by my account representative.
WITH derived ( `email` ) AS (
	VALUES
		ROW ( 'bob@example.com' ),
		ROW ( 'sarah@example.com' ),
		ROW ( 'tig@example.com' ),
		-- ... hundreds of emails ...
)
-- Compare the above to our internal cache.
SELECT
	SUBSTRING_INDEX( d.email, '@', -1 ) AS domain,
	d.email,
	c.textCode
FROM
	derived d
LEFT OUTER JOIN
	email_validation_cache c
ON
	c.email = d.email
WHERE
	-- no matching record.
	c.textCode IS NULL
OR
	-- record which indicates a future omission.
	c.textCode NOT IN ( 'ok', 'valid' )
ORDER BY
	domain ASC,
	d.email ASC
;

Then, I'll take those results and put them in a Google sheet for the account rep to review. For users that don't have any validation record (c.textCode IS NULL), she'll run those emails through a validation service like NeverBounce or Email List Verify (and we'll update the internal cache with the new results). For companies that have too many skipped users, she'll do some research to see if there are other emails in the company that we can include in the send.

It's these kinds of out-of-app, manually-orchestrated queries that are made much easier by the CTE technique.

438 Comments

@Ben Nadel,

That's interesting, but it seems to me that CTEs are used mainly for readability. Because there doesn't seem to be much difference between a sub query & a CTE, unless you use the RECURSIVE keyword? Or unless you create a temporary table from scratch, as you have in the example above. 🤔

15,827 Comments

Yeah, I just deleted the comments. I wasn't sure at first, but the second comment (first was on a different post) confirmed.

233 Comments

@Ben Nadel,

Good stuff! First time I'm seeing this page. Is it a hidden URL, or is it hooked into the site's navigation somewhere? If it is, I couldn't find it.

15,827 Comments

@Chris,

I link to it in the site footer. I only just added this page a few weeks ago. It was just something I've been thinking about doing for a while. It's mostly for my own usage; but, I figured I'd make it public for anyone else that might find it helpful.

233 Comments

@Ben Nadel,

Ah, I see it now! Thanks for pointing that out... Wasn't sure if I needed to bookmark it to find it again. I appreciate you, all that you do, and all that you share. 🙏

15,827 Comments

You are very kind, good sir :) If you have any suggestions for utilities, let me know. I'm just keeping notes as I think to myself, "man, I should build something for XYZ."
