How to quickly judge the quality of a database
When I start using a new database, I usually run the following queries to find how good or bad the project is going to be:
select count(*) as [Number Of Tables] from information_schema.tables
Anything bigger than a couple of hundreds, and I start to feel really nervous. Ideally, it is around 20 - 50 main ones, and maybe additional dozen tables for constants. I have worked on databases where then numbers are in the thousands (and yes, I did touch every table)
select
table_name, count(*) as [Number Of Columns In Table] from information_schema.columns
group by table_name
order by count(*) desc
If you see a table with more than 25 - 30 columns, this is a sign of a big problem. Either the database is not normalizied, the data model is completely wierd, or you are looking into a "the database in a single table".
The following is my number one criteria for the quality of the database:
select
column_name, count(*) [Number
Of Columns Named This Way] from information_schema.columns
where
data_type = 'DATETIME'
group by column_name
order by count(*) desc
The important things to note here are:
- The number of rows returned, the more there are, the worst you are.
- The commonly named columns are very important. I often see things (in Hebrew) that looks like this: Taarich / Taarih / Tarich / Taaizh / Taar - Different ways to transliterate Date from Hebrew to English. When I see those prolifer, I know that I am in trouble.
A variant of the above is to run it over all columns regardless of datatype, and check for common names, but I found that focusing on the dates is fairly accurate.
Comments
Comment preview