I have recently inherited a set of very large SQL Server databases. The application and database schema are a mess. I have run across a few fields in the database that store different types of sensitive data where they should not be stored. Since there are almost 10,000 tables in my database, I am in desperate need of a way to programmatically scan a few of these databases to find out where the data is. I realize this will be very resource intensive, so I have set up a server specifically to run a scan on backups of the databases.
I also have zero dollars for purchasing any tools.
Does anyone know of a way with C# and SQL that I can scan all user tables in the database for sensitive data?
An example of scanning for one type of data (e.g. SSN) would be extremely helpful. I am confident that I can extrapolate that into all the scenarios I would need.
This SQL will list all user tables and row counts in a database. It will be a starting point.
SELECT o.name,
ddps.row_count
FROM sys.indexes AS i
INNER JOIN sys.objects AS o ON i.OBJECT_ID = o.OBJECT_ID
INNER JOIN sys.dm_db_partition_stats AS ddps ON i.OBJECT_ID = ddps.OBJECT_ID
AND i.index_id = ddps.index_id
WHERE i.index_id < 2 AND o.is_ms_shipped = 0 ORDER BY o.NAME
Hth,
O
This query will help you find columns with a particular name and data type:
SELECT t.name AS table_name,
SCHEMA_NAME(t.schema_id) AS schema_name,
c.name AS column_name ,tp.name
FROM sys.tables AS t
INNER JOIN sys.columns c ON t.OBJECT_ID = c.OBJECT_ID
INNER JOIN sys.types tp ON tp.user_type_id=c.user_type_id
WHERE c.name LIKE '%Product%' AND tp.name LIKE '%int%'
ORDER BY schema_name, table_name;
This might be irrelevant at this point of time but shall serve as an additional note: You can use Information Schema Views to query the database objects which comply with the ISO standard definition for the INFORMATION_SCHEMA.
MSDN LINK
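For the SSN case from the question, the column query above narrows the search to candidate text columns; the values sampled from each candidate then need a pattern check. Below is a minimal sketch of that check, written in Python rather than C# only to keep it short and self-contained (the same regular expression syntax is supported by .NET's Regex class). The names SSN_PATTERN and count_ssn_hits are made up for illustration, and the rows are assumed to have already been fetched from the database. The pattern accepts nine digits with either no separators or the same separator used twice, and skips ranges that are never issued (area 000, 666 and 900-999; group 00; serial 0000).

```python
import re

# Nine digits, optionally split 3-2-4 by the same separator (hyphen or space).
# The lookarounds reject longer digit runs and never-issued number ranges.
SSN_PATTERN = re.compile(
    r"(?<!\d)(?!000|666|9\d\d)\d{3}([- ]?)(?!00)\d{2}\1(?!0000)\d{4}(?!\d)"
)


def count_ssn_hits(column_names, rows):
    """Return {column name: number of rows whose value contains an SSN-like string}.

    column_names: list of column names, in the same order as the row values.
    rows: iterable of tuples, e.g. a sample of rows read from one table.
    Only columns with at least one hit are reported.
    """
    hits = {name: 0 for name in column_names}
    for row in rows:
        for name, value in zip(column_names, row):
            if isinstance(value, str) and SSN_PATTERN.search(value):
                hits[name] += 1
    return {name: count for name, count in hits.items() if count > 0}


if __name__ == "__main__":
    sample = [
        ("Ann", "SSN 123-45-6789 on file"),
        ("Bob", "call 555-1234"),
    ]
    print(count_ssn_hits(["name", "notes"], sample))
```

Unformatted nine-digit values will produce false positives (order numbers, phone fragments), so the hit counts are best treated as a list of columns to review by hand, not as proof.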
If you can open the database in Microsoft SQL Server Management Studio, you can try ApexSQL. It's a plugin that can be downloaded from here:
http://www.apexsql.com/sql_tools_search.aspx
For example: you select the database and you can look for a column name. It will show you all tables in which you have that column.
Hope it helps.
We're using .NET Entity Framework to talk to an Azure SQL database. We used QueryOriginInterceptor to add some comments to the top of each SQL command being sent to SQL Server, with the goal of helping identify the location where a particular query came from in the code.
The problem is, when looking at long running queries in the Azure UI (and looking in sys.dm_exec_query_stats), the comments are not there.
For example, if we run this query:
-- Stack:
-- Utils.Orders.GetOrders
select *
from [Order] o
join OrderItem oi on oi.OrderId = o.ID
And looking in Azure, the long running query looks like:
Is there a way to preserve these comments?
sys.dm_exec_query_stats does not include the comments, but dm_exec_sql_text does.
This article explains how to use the two to diagnose issues.
The relevant SQL query from the article is:
SELECT TOP 25
databases.name,
dm_exec_sql_text.text AS TSQL_Text,
CAST(CAST(dm_exec_query_stats.total_worker_time AS DECIMAL)/CAST(dm_exec_query_stats.execution_count AS DECIMAL) AS INT) as cpu_per_execution,
CAST(CAST(dm_exec_query_stats.total_logical_reads AS DECIMAL)/CAST(dm_exec_query_stats.execution_count AS DECIMAL) AS INT) as logical_reads_per_execution,
CAST(CAST(dm_exec_query_stats.total_elapsed_time AS DECIMAL)/CAST(dm_exec_query_stats.execution_count AS DECIMAL) AS INT) as elapsed_time_per_execution,
dm_exec_query_stats.creation_time,
dm_exec_query_stats.execution_count,
dm_exec_query_stats.total_worker_time AS total_cpu_time,
dm_exec_query_stats.max_worker_time AS max_cpu_time,
dm_exec_query_stats.total_elapsed_time,
dm_exec_query_stats.max_elapsed_time,
dm_exec_query_stats.total_logical_reads,
dm_exec_query_stats.max_logical_reads,
dm_exec_query_stats.total_physical_reads,
dm_exec_query_stats.max_physical_reads,
dm_exec_query_plan.query_plan,
dm_exec_cached_plans.cacheobjtype,
dm_exec_cached_plans.objtype,
dm_exec_cached_plans.size_in_bytes
FROM sys.dm_exec_query_stats
CROSS APPLY sys.dm_exec_sql_text(dm_exec_query_stats.plan_handle)
CROSS APPLY sys.dm_exec_query_plan(dm_exec_query_stats.plan_handle)
INNER JOIN sys.databases
ON dm_exec_sql_text.dbid = databases.database_id
INNER JOIN sys.dm_exec_cached_plans
ON dm_exec_cached_plans.plan_handle = dm_exec_query_stats.plan_handle
WHERE databases.name = 'AdventureWorks2014'
ORDER BY dm_exec_query_stats.max_logical_reads DESC;
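Once the statement text comes back with its comments (the TSQL_Text column in the query above), the origin lines can be separated from the statement itself. Here is a minimal sketch in Python, used only so the example is self-contained. The function name extract_origin_comments is hypothetical, and it assumes the comments are leading "--" lines, as in the example from the question.

```python
def extract_origin_comments(sql_text):
    """Return the leading '--' comment lines of a SQL batch, without the '--' prefix.

    Stops at the first line that is neither blank nor a '--' comment,
    so comments inside the statement body are ignored.
    """
    origin = []
    for line in sql_text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue  # skip blank lines before or between the comments
        if not stripped.startswith("--"):
            break  # the statement itself starts here
        origin.append(stripped[2:].strip())
    return origin


if __name__ == "__main__":
    text = "-- Stack:\n-- Utils.Orders.GetOrders\nselect *\nfrom [Order] o"
    print(extract_origin_comments(text))
```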
I have 3 tables in my SQL Server database (3 for this example; in reality there are more than 30 tables like these): post, user, person.
post: (post_id, post_text, user_id)
user: (user_id, user_name, person_id)
person: (person_id, person_phone, person_email)
Now, in C#, I want an algorithm that creates a query that returns a result like this:
post.post_id, post.post_text, post.user_id, user.user_id, user.user_name, user.person_id, person.person_id, person.person_email
I then use the result to fill a SqlDataReader in C# to read all column values from these records.
I know the usual way to get that result is to write the JOIN statement by hand, but that wastes time when there are many tables. So I want an algorithm that generates this query so I can work with it in C#.
Thanks a lot.
To query all column names from your tables, you can use:
SELECT obj.Name + '.' + col.Name AS name
FROM sys.columns col
INNER JOIN sys.objects obj
ON obj.object_id = col.object_id
WHERE obj.Name IN ('post', 'user', 'person')
ORDER BY name
Then, for how to call this from C# using SqlDataReader, you can use this documentation: https://msdn.microsoft.com/en-us/library/haa3afyz(v=vs.110).aspx
select post1.post_id, post1.post_text, post1.user_id,
       user1.user_id, user1.user_name, user1.person_id,
       person1.person_id, person1.person_email
from post post1
inner join [user] user1 on user1.user_id = post1.user_id  -- user is a reserved word, so it needs brackets
inner join person person1 on person1.person_id = user1.person_id
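The two answers above can be combined into the generator the question asks for: take the column list per table (for example from the sys.columns query) and a list of join relationships, then build the statement as a string. What follows is a rough sketch in Python, chosen only so it runs on its own; the same string building carries over to C#. The names quote and build_join_query are hypothetical, and the sketch assumes each join uses a key column with the same name in both tables, as in the post/user/person example. Each column is aliased as [table.column] so that duplicate names such as user_id stay distinguishable in the reader.

```python
def quote(name):
    """Bracket-quote a SQL Server identifier, escaping any closing bracket."""
    return "[" + name.replace("]", "]]") + "]"


def build_join_query(columns_by_table, joins):
    """Build a SELECT with table-qualified columns and INNER JOINs.

    columns_by_table: dict of table name -> list of column names; the first
        table is used in the FROM clause.
    joins: list of (left_table, right_table, key_column) tuples, in join order.
    """
    tables = list(columns_by_table)
    select_list = ", ".join(
        f"{quote(t)}.{quote(c)} AS {quote(t + '.' + c)}"
        for t in tables
        for c in columns_by_table[t]
    )
    sql = f"SELECT {select_list} FROM {quote(tables[0])}"
    for left, right, key in joins:
        sql += (
            f" INNER JOIN {quote(right)}"
            f" ON {quote(right)}.{quote(key)} = {quote(left)}.{quote(key)}"
        )
    return sql


if __name__ == "__main__":
    columns = {
        "post": ["post_id", "post_text", "user_id"],
        "user": ["user_id", "user_name", "person_id"],
        "person": ["person_id", "person_email"],
    }
    joins = [("post", "user", "user_id"), ("user", "person", "person_id")]
    print(build_join_query(columns, joins))
```

Bracket-quoting every identifier also takes care of reserved words such as user. With 30 or more tables, the joins list could be read from the foreign key metadata (sys.foreign_keys) instead of being typed by hand, though that part is not shown here.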
I have two tables, and they are indexed in the Azure database manager. I have the foreign key in the second table.
My tables are, for example:
OrderTable (OrderId, OrderDate, CustomerId) (CustomerId is my foreign key)
CustomerTable (CustomerId, CustomerName, ...)
So I just want a query like this:
Select *
From OrderTable o1,CustomerTable c1
Where c1.CustomerId=o1.CustomerId
I used the Microsoft sample TodoItems, and I can already make queries on one table like this:
items = await todoTable
.Where(todoItem => todoItem.Date >= DateTime.Now)
.ToCollectionAsync();
So in my app I have the two tables. Is there any option to query the joined tables like the one above?
You can perform joins in LINQ, but in your situation, it's probably easier to create a view that does the join and then select from that using LINQ.
Also, prefer the explicit INNER JOIN clause over the older comma-separated join syntax you used; it keeps the join condition separate from any filters and makes an accidental cross join harder to write, i.e.
SELECT * FROM OrderTable o1 INNER JOIN CustomerTable c1
ON c1.CustomerId = o1.CustomerId
Create a view like Rikalous pointed out. You can do this by clicking on SQL Databases in your Windows Azure portal. Select your server and then click on the "Manage URL" located on the Dashboard page at the bottom right.
Once you log in, click on "New Query" and then just type in the SQL code to create a view.
CREATE VIEW mySchema.myView AS
SELECT * FROM Table t1 INNER JOIN OtherTable t2 ON t1.a=t2.b
Once your view is created, go back to your Windows Azure Portal. Go to your Mobile Service and create a new table. Create the table with the name of your view, the system will detect the view and present it to you. You will see that no default columns are present nor any data will show up. But you will be able to query it as any other table, plus you can also modify its insert/update/read scripts.
*Important: double check that your view is created on the correct schema. Also double check after adding the table on Mobile Services that no table was created on the server.
I've got a scenario where I will need to order by on a column which is a navigation property for the Users entity inside my EF model.
The entities:
Users --> Countries 1:n relationship
A simple SQL query would be as follows:
SELECT UserId, u.Name, c.Name
FROM users u join countries c on u.CountryId = c.CountryId
ORDER BY c.Name asc;
So then I tried to replicate the above SQL query using Linq to Entities as follows - (Lazy Loading is enabled)
entities.users.OrderBy(field => field.country.Name).ToList();
But this query does not return my countries sorted by their name as the native SQL query above does.
However I continued a bit more and did the following:
var enumeratedUsers = entities.users.AsEnumerable();
users = enumeratedUsers.OrderBy(fields => fields.country.Name).ToList();
But ordering the enumeratedUsers object for about 50 records took approx. 7 seconds.
Is there a better way that avoids the Enumerable and does not return an anonymous type?
Thanks
EDIT
I forgot to say that the EF provider is a MySQL one, not MS SQL. In fact I just tried the same query on a replicated database in MS SQL and the query works fine, i.e. the country name is ordered correctly. So it looks like I have no option other than getting the result set from MySQL and executing the order by in memory on the enumerable object.
var enumeratedUsers = entities.users.AsEnumerable();
users = enumeratedUsers.OrderBy(fields => fields.country.Name).ToList();
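This is LINQ to Objects, not LINQ to Entities.
The OrderBy above is the one defined on Enumerable,
so the ordering is done in memory after all the rows have been fetched. With lazy loading enabled, each access to fields.country can also trigger a separate query, which is likely why it takes so long.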
Edit
It looks like a MySQL related issue
You may try something like this.
var users = from user in entities.users
join country in entities.Country on user.CountryId equals country.Id
orderby country.Name
select user;
entities.users.OrderBy(field => field.country.Name).ToList();
But this query does not return my countries sorted by their name as the native
SQL query above does.
Yes, it does not return Countries but only Users sorted by the name of country.
When this query is executed, the following sql is sent to DB.
SELECT u.*
FROM users u join countries c on u.CountryId = c.CountryId
ORDER BY c.Name asc;
As you can see, the result does not include any fields of countries. As you mentioned lazy loading, countries are loaded through it when needed. At that point, countries are ordered in the order in which lazy loading requested them. You can access the loaded countries through the Local property of an entity set.
So if you want users sorted by country name and also countries sorted by name, you need eager loading, as @Dennis mentioned:
entities.users.Include("country").OrderBy(field => field.country.Name).ToList();
This is converted to the following sql.
SELECT u.*, c.*
FROM users u join countries c on u.CountryId = c.CountryId
ORDER BY c.Name asc;
Have you tried using Include?
entities.users.Include("country").OrderBy(field => field.country.Name).ToList();
SOLUTION
Since both the Countries and Users tables have a column named Name, the MySQL Connector generated this SQL when ordering by country.Name:
SELECT `Extent1`.`Username`, `Extent1`.`Name`, `Extent1`.`Surname`, `Extent1`.`CountryId`
FROM `users` AS `Extent1` INNER JOIN `countries` AS `Extent2` ON `Extent1`.`CountryId` = `Extent2`.`CountryId`
ORDER BY `Name` ASC
Therefore the results are ordered by users.Name rather than countries.Name.
However, MySQL has released version 6.4.3 of the .NET connector, which resolves a number of issues, among them:
We are also including some SQL generation improvements related to our entity framework provider. Source: http://forums.mysql.com/read.php?3,425992
Thank you for all your input. I tried to be as clear as possible to help others who might encounter the same issue.
Is it possible to do an inner join on 2 tables where both the tables are on different server??
Add a linked server (B) to server A then write the following query
SELECT
*
FROM
[SERVERB].[DATABASE].[SCHEMA].[TABLE] A
INNER JOIN [SERVERA].[DATABASE].[SCHEMA].[TABLE] B ON A.ID = B.ID
It is certainly possible in SQL code. How you would do it in C# I don't know, but in SQL Server I would set up linked servers, and then the code is:
select t1.field1, t2.field2
From server1.database1.dbo.table1 t1
join server2.database2.dbo.table2 t2
on t1.id = t2.id
So you just use the four part name instead of the three part name. But you do have to have a linked server set up first.
You can download both tables to the client, then perform a join using LINQ.
For a more detailed answer, please provide more details.
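The client-side join suggested above comes down to indexing one table by the join key and looking up each row of the other. A minimal sketch of that idea follows, in Python only so it is self-contained; in C# the LINQ join operator does the same job. The function name hash_join is made up, and both tables are assumed to have already been downloaded as lists of dictionaries that share the key column name.

```python
from collections import defaultdict


def hash_join(left_rows, right_rows, key):
    """Inner join two lists of dict rows on a shared key column.

    Builds an index over right_rows, then emits one merged row per match.
    Left rows without a match are dropped, as in a SQL INNER JOIN.
    """
    index = defaultdict(list)
    for row in right_rows:
        index[row[key]].append(row)
    return [
        {**left, **right}
        for left in left_rows
        for right in index.get(left[key], [])
    ]


if __name__ == "__main__":
    table_a = [{"id": 1, "field1": "x"}, {"id": 2, "field1": "y"}]
    table_b = [{"id": 1, "field2": "z"}]
    print(hash_join(table_a, table_b, "id"))
```

This only makes sense when both tables are small enough to pull over the network; for large tables a linked server keeps the join on the database side.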
If you are using SQL Server try using a Linked Server, if Oracle use a database link. I am not sure how it would be achieved in the rest.