Skip to content

More schema reading

martinjw edited this page Dec 7, 2018 · 5 revisions

Reading parts of the schema

Most database schemas can be read simply:

var dbReader = new DatabaseReader(connectionString, providername);
dbReader.Owner = "dbo";
var schema = dbReader.ReadAll();

In .net Core, you must create and manage the connection, but otherwise the Read methods are identical:

using (var connection = new SqlConnection("Data Source=.\SQLEXPRESS;Integrated Security=true;Initial Catalog=Northwind"))
{
    var dr = new DatabaseSchemaReader.DatabaseReader(connection);
    var schema = dr.ReadAll();
    //do something with your schema - see other pages in the wiki!
}

But some databases have thousands of tables. Loading all that takes time, and a significant amount of memory. The DatabaseSchemaReader allows you to read specific parts of the schema, although you will lose some of the cross-references in the schema model.

CancellationTokens and events

In .net 4.0+, you can use overloads with a CancellationToken to stop long running operations.

var cancelToken = new CancellationTokenSource();
Task.Factory.StartNew(dbReader.ReadAll, cancelToken.Token);
Thread.Sleep(1000); //give it a second
cancelToken.Cancel(false);
//may be a partially loaded schema
var schema = dbReader.DatabaseSchema;

For more insight into what's happening, there is a ReaderProgress event.

dbReader.ReaderProgress += (object sender, ReaderEventArgs e) =>
{
	Console.WriteLine(e.ProgressType); //reading or processing
	Console.WriteLine(e.SchemaObjectType); //tables, columns, views ....
	Console.WriteLine(e.Name); //name of table/view/procedure 
	if (e.Index.HasValue) //if iterating through collection
	{
		Console.WriteLine(e.Index + " of " + e.Count); //eg on 10 of 100 tables
	}
};

Excluding and including

When you read a large database, you can exclude certain elements. The standard filter is a list of names, but you can substitute the built-in PrefixFilter, or your own class inheriting from Filter or IFilter.

//we don't want _MigrationHistory table
dbReader.Exclusions.TableFilter.FilterExclusions.Add("_MigrationHistory");
//remove the spCreateDiagram system sprocs
dbReader.Exclusions.StoredProcedureFilter = new PrefixFilter("sp");

There is also a built-in InclusionFilter, which inverts the logic: it includes only the names you specify.

dbReader.Exclusions.TableFilter = new InclusionFilter {FilterExclusions = {"Categories", "Products" } };

Reading only tables and views

ReadAll() is reading everything from the schema- tables, views, columns, constraints, stored procedures. If you don't care about stored procedures, you can get just tables and views.

dbReader.AllTables();
dbReader.AllViews();
//dbReader.AllStoredProcedures(); //but not this one!
var schema = dbReader.DatabaseSchema;

Reading only one table

If you know the name of a table, you can load only that table. Some reference properties will be null, because we only have a partial model.

//load a single table
dbReader.DataTypes(); //load the datatypes - not required, but assigns column.DataType property
var products = dbReader.Table("Products");
Console.WriteLine(products.Columns.Count); //all the columns are there
Console.WriteLine(products.PrimaryKeyColumn.Name); //the primary key column
Console.WriteLine(products.ForeignKeys.Count); //all the foreign keys are there
//we don't have foreign keys pointing here (this is the principal end of a relationship)
Console.WriteLine(products.ForeignKeyChildren.Count);

var categoryId = products.FindColumn("CategoryID");
Console.WriteLine(categoryId.IsForeignKey == true); //shows true
Console.WriteLine(categoryId.ForeignKeyTableName); //shows "Categories"
Console.WriteLine(categoryId.ForeignKeyTable); //is null, because we haven't loaded it yet

Note the reader.DataTypes() call is optional. If called, it allows the column.DataType property to be assigned.

Reading incrementally

The TableList() method gives a list of tables with only the name. There are no columns or constraints, so it is fast and efficient.

We can then read the data in pages:

var tableList = dbReader.TableList();
//take first 10 only
var max = tableList.Count < 10 ? tableList.Count : 10;
for (int i = 0; i < max; i++)
{
	var table = tableList[i];
	//load the full table
	var fullTable = dbReader.Table(table.Name);
	Console.WriteLine(fullTable.Columns.Count);
}

Fixing up references

When you load a single table, you have scalar properties, so you know the foreign key is to a table name, but not reference properties, so no link to the other table.

To update all the bi-directional references, use the schema fixer:

DatabaseSchemaFixer.UpdateReferences(schema);

Columns have a string DbDataType property. In addition, there is a DataType reference property, which gives more information (including .net equivalent type). This is loaded in ReadAll(), but not the other methods. You can load the data types from the database reader first (which is what ReadAll does internally).

dbReader.DataTypes();
var products = dbReader.Table("Products");
//now columns has a .DataType property

Command Timeout

With large databases with thousands of tables and columns, you may have ADO command timeouts. You can read incrementally (see above), or, from v2.7.0, you can increase the default command timeout (normally 30 seconds). It sets the IDbCommand.CommandTimeout (https://docs.microsoft.com/en-us/dotnet/api/system.data.idbcommand.commandtimeout) of every command it uses.

var dbReader = new DatabaseReader(dbConnection);
dbReader.Owner = "dbo";
dbReader.CommandTimeout = 0; //The time in seconds to wait for the command to execute.
var schema = dbReader.ReadAll();