Marc Breault Ramblings

I have many interests ranging from religion to NFL football. This is a place where I ramble on about whatever I feel like rambling about.

Wednesday, February 19, 2025

No Reason for Total Access

I know CNN wants to get both sides of an issue, but many of the Republican sympathizers downright lie. Today some clown on The Situation Room claimed Elon Musk has the same level of access everyone else at his level does. BS. What DOGE demands is read/write access to all of the real data. In the tech world we call real data Production data. No one has read/write access to all of the data, with the possible exception of database administrators (DBAs). People have access to individual records. We have all phoned a government department and, once we prove we are who we say we are, someone brings our record up and deals with the issue. But having access to individual records is a far cry from having access to all of the records for everyone. No one has that, not even the head of the IRS or the SSA. The President of the United States does not have this access. But Elon Musk tells the American people that DOGE needs access to all of the data so it can look for fraud and waste. So how can you look for fraud and waste without having access to all of the data?

 

1. Random audits. The IRS and other organizations use random audits. I got audited by the IRS once and they found I had made a mistake in my tax return (which I still file despite living in Australia) and they had made a mistake as well, so we sorted it out without difficulty.

2. Transaction algorithms. Fraud detection software looks for unusual patterns of transactions. For example, if someone switches credit cards multiple times in a short span of time (say, five credit card changes in September), this is flagged as a possible problem. One method the IRS used to use is a Benford’s Law test. Look that up. It’s bizarre but it works. The key thing to know about this type of search is that, initially, you don’t care who the people are. You just look for patterns, and these are just numbers. If something is flagged, then you can find out who the human is, but you only look at that one individual’s record.

3. Anti-money laundering. The US government publishes a blacklist that is updated regularly, and transactions can be matched against the blacklist. This list is published throughout the world. I have used the US list myself. But again, you initially only look for matches without caring who people are. It’s only when there is a match that you look more closely into the individual account.
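To make the blacklist matching in item 3 concrete, here is a minimal sketch in Python. Everything in it is made up for illustration: the blacklist entries, the account numbers, the `pseudonym` and `screen` helpers, and the salt. Real screening software uses fuzzy name matching and far more careful key management. The only point is that the screening step works on tokens and counterparty names, and a real account is opened only after a match.

```python
import hashlib


def pseudonym(account_id: str, salt: str = "demo-salt") -> str:
    """Turn an account ID into a one-way token (hypothetical scheme).

    The screening step only ever sees this token, not the account ID.
    """
    return hashlib.sha256((salt + account_id).encode()).hexdigest()[:12]


def screen(transactions, blacklist):
    """Return the tokens of accounts that transacted with a blacklisted name.

    Assumes exact, case-insensitive name matching, which is a simplification.
    """
    listed = {name.strip().lower() for name in blacklist}
    return sorted(
        {t["token"] for t in transactions
         if t["counterparty"].strip().lower() in listed}
    )


# Made-up blacklist and transactions, for illustration only.
blacklist = ["Shady Holdings Ltd", "Acme Shell Co"]
transactions = [
    {"token": pseudonym("ACC-1001"), "counterparty": "Corner Grocery", "amount": 42.10},
    {"token": pseudonym("ACC-1002"), "counterparty": "shady holdings ltd", "amount": 9800.00},
    {"token": pseudonym("ACC-1003"), "counterparty": "City Utilities", "amount": 120.00},
]

flagged = screen(transactions, blacklist)
print(flagged)  # only the flagged token goes to a human for a closer look
```

Only one token comes out of the screen, so only one account needs to be looked at by a person. The other two accounts are never opened.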

 

Then there is analytics and aggregation. You might want to know, for example, how many Latino men over 45 years old and earning more than $70K/year file their tax returns late. This sort of query is useful, and we call this business intelligence data or aggregated data. There are two ways to do this. One is to copy the Production data but de-identify it so you cannot know who an individual is. The other, more common way is to create business intelligence data models and copy in only those fields you aggregate on. So in my example above, you would not have the person’s name, but you would have ethnicity, either from an actual data field that captures it or from a routine that guesses it. So you migrate only the data you need. The result is data you can do all kinds of querying on without knowing a single individual. Thus my birth date is part of the aggregated data, but no one knows it is my birth date. They simply know someone was born on that date. Social security numbers have no use for aggregation as far as I know, so they would not be copied over.
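Here is a rough sketch of the second approach in Python, using the late-filer question from above. The rows, field names, and the `to_bi_row`, `age_on`, and `count_late_filers` helpers are all invented for the example. A real business intelligence model would live in a data warehouse, not a Python list, but the idea is the same: copy over only the fields you aggregate on.

```python
from datetime import date

# Made-up Production rows. Names and SSNs exist here and nowhere else below.
production = [
    {"name": "Person A", "ssn": "000-00-0001", "birth_date": date(1970, 5, 1),
     "sex": "M", "ethnicity": "Latino", "income": 85000, "filed_late": True},
    {"name": "Person B", "ssn": "000-00-0002", "birth_date": date(1985, 3, 10),
     "sex": "M", "ethnicity": "Latino", "income": 90000, "filed_late": True},
    {"name": "Person C", "ssn": "000-00-0003", "birth_date": date(1965, 11, 30),
     "sex": "F", "ethnicity": "Latino", "income": 75000, "filed_late": True},
    {"name": "Person D", "ssn": "000-00-0004", "birth_date": date(1960, 1, 15),
     "sex": "M", "ethnicity": "Latino", "income": 72000, "filed_late": False},
    {"name": "Person E", "ssn": "000-00-0005", "birth_date": date(1975, 8, 20),
     "sex": "M", "ethnicity": "Latino", "income": 71000, "filed_late": True},
]

# Only these fields are migrated into the analytics copy.
BI_FIELDS = ("birth_date", "sex", "ethnicity", "income", "filed_late")


def to_bi_row(row):
    """Copy only the aggregation fields; name and SSN are left behind."""
    return {field: row[field] for field in BI_FIELDS}


def age_on(birth_date, today):
    """Age in whole years on the given day."""
    had_birthday = (today.month, today.day) >= (birth_date.month, birth_date.day)
    return today.year - birth_date.year - (0 if had_birthday else 1)


def count_late_filers(rows, today):
    """Count Latino men over 45 earning more than $70K who filed late."""
    return sum(
        1 for r in rows
        if r["ethnicity"] == "Latino" and r["sex"] == "M"
        and age_on(r["birth_date"], today) > 45
        and r["income"] > 70000 and r["filed_late"]
    )


bi_rows = [to_bi_row(r) for r in production]
late_count = count_late_filers(bi_rows, date(2025, 2, 19))
print(late_count)
```

The analyst running `count_late_filers` gets an answer to the question but has no way to tell whose rows were counted, because the name and social security number were never copied over.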

 

The point of all this is that analytics, fraud detection, and audits can be done without giving a single human being on planet Earth full access to all of the data. Such access is never granted. In my years working on stuff like this I have never once had access to all Production data, and certainly never read/write. I have had access to individual records to analyze why there was an error, but that’s it.

 

Access to Production systems and databases is granted, but it is strictly monitored. You only get access to Production for two reasons: to implement a change, or to fix a problem. In either case, what you do and when you do it is strictly monitored and audited. Having access to all data is never permitted, not even read-only access.

 

When a change is made to Production, you go through the Change Advisory Board (CAB) before anything gets done. Each change is specifically detailed. The exact time you make the change is detailed. Who makes what change and in what order is detailed. Error checking is detailed along with a rollback plan and possible shakedown testing after Production has been changed. In this case a DBA does have access to the Production database, but what he is authorized to do is spelled out in excruciating detail, and there are people on hand to monitor so that we can make sure exactly what is in the change request goes in. If you do not include all of these things in your change request, the CAB will reject your change.
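As a rough illustration of that last point, here is a small Python sketch of the kind of completeness check a change request has to survive. The field names and the `missing_items` and `cab_precheck` helpers are hypothetical, not taken from any real CAB tooling, and a real board reviews the substance of each item, not just whether it is present.

```python
# Items every change request must spell out (hypothetical field names).
REQUIRED_ITEMS = (
    "description",      # what exactly is being changed
    "scheduled_time",   # the exact time the change is made
    "steps",            # who makes what change, in what order
    "error_checks",     # how errors will be detected
    "rollback_plan",    # how to undo the change if it goes wrong
)


def missing_items(change_request):
    """List the required items that are absent or left empty."""
    return [item for item in REQUIRED_ITEMS if not change_request.get(item)]


def cab_precheck(change_request):
    """Reject an incomplete request before the board even looks at it."""
    missing = missing_items(change_request)
    if missing:
        return ("rejected", missing)
    return ("ready for review", [])


# A made-up request that forgot its rollback plan.
request = {
    "description": "Add index to payments table",
    "scheduled_time": "2025-02-22 02:00",
    "steps": [("DBA", "create index"), ("tester", "run shakedown queries")],
    "error_checks": "compare row counts before and after",
    "rollback_plan": "",
}

print(cab_precheck(request))
```

Shakedown testing is left out of the required list here because the post describes it as possible, not mandatory.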

 

CABs are a pain in the butt. I hated every one of them. But they are absolutely necessary.

 

So, when Elon Musk and the GOP tell you that read/write access to all data is required to search for fraud, they are lying. Banks and other institutions have been fighting fraud exactly as I outlined for decades without granting anyone full access to all of the data. IT professionals take data security and privacy seriously. One screw-up can result in getting fired and in the company paying big fines. The American people are being conned again and again by Trump and Musk.

 

NOTE:  As part of the Whitewater investigation of the Clintons, the IRS ran a Benford’s Law test over all of the Clinton transactions.  The Clintons passed.  What that means is that filter showed there was no fraud or wrongdoing.  It’s not the only filter that is used, but since I mentioned it above, I thought I would add in this tidbit.  
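For anyone who did look up Benford’s Law: it says that in many naturally occurring sets of numbers, the leading digit d shows up with probability log10(1 + 1/d), so about 30% of numbers start with 1 and fewer than 5% start with 9. Below is a minimal Python sketch of a first-digit test. It is my own illustration, not the IRS’s actual method. The `benford_chi_square` helper, the sample data, and the 5% cutoff are all assumptions, and a high score is only a reason to look closer, not proof of anything.

```python
import math
from collections import Counter


def benford_expected(digit):
    """Probability that a number's leading digit is `digit` under Benford's Law."""
    return math.log10(1 + 1 / digit)


def first_digit(value):
    """Leading non-zero digit, read off the scientific-notation form."""
    return int(f"{abs(float(value)):.15e}"[0])


def benford_chi_square(amounts):
    """Chi-square distance between observed leading digits and Benford's Law."""
    digits = [first_digit(a) for a in amounts if a != 0]
    n = len(digits)
    counts = Counter(digits)
    return sum(
        (counts.get(d, 0) - n * benford_expected(d)) ** 2 / (n * benford_expected(d))
        for d in range(1, 10)
    )


# Chi-square critical value for 8 degrees of freedom at the 5% level.
CUTOFF = 15.507

# Powers of two are a textbook example of numbers that follow Benford's Law.
natural_looking = [2 ** k for k in range(200)]

# Every leading digit equally common: the kind of pattern that gets flagged.
too_even = list(range(100, 1000))

natural_score = benford_chi_square(natural_looking)
even_score = benford_chi_square(too_even)
print(natural_score < CUTOFF, even_score < CUTOFF)
```

Notice that the test only ever sees amounts. No names, no account numbers, just numbers, which is the whole point of this kind of filter.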
