Terry Milholland is the Chief Technology Officer for the Internal Revenue Service. Before coming to the IRS, he had a long career as a technology leader in the private sector at companies such as Visa, EDS and Boeing. I asked Terry to discuss his experiences at the IRS and emerging trends in the use of data in government.
Castro: While “Big Data” is new for some organizations, the IRS has been managing large amounts of data for years. But the technology keeps changing. What types of data analytic capabilities are available to you now that were not available five or ten years ago?
Milholland: The biggest difference between today and five years ago is the expansion of capabilities to handle pools of data that are getting dramatically larger. Aggregation techniques have been developed, and new analysis methods have become dramatically faster while also becoming easier to use. Technology such as massively parallel processing now allows you to do trending over more dimensions and more years in a more timely manner than was previously possible. Because the technology can produce faster results, you get more effective use of your software algorithms for analytics. Even as the capabilities grew, the lure of “what-if” analysis was for the most part unrealistic unless one was willing to execute the query and come back hours or even days later to see the results. The speed now makes what-if analysis more practical. Since the user interfaces for analytical software are now more browser based, web oriented, or drag and drop, the audience for analytics has grown tremendously over the past five years. A larger audience that can view the data and share results with each other via social network technology helps to crowdsource results, which to some degree leverages the analytics.
Castro: What kind of benefits should taxpayers expect from IT investments by the IRS?
Milholland: Taxpayers are already beginning to reap the benefits of our focus on IT investments. One major investment was the establishment of a foundation for a single authoritative database for all taxpayers, starting with a robust data model for every data element that the IRS needs for its mission. Together with changes in how we process returns, this means more taxpayers receive their refunds faster now that we process returns daily instead of weekly, and their accounts and payments are uploaded and updated in a much more timely manner. Aggregating taxpayer information from multiple sources takes less time now. Our taxpayer assistors can view taxpayer account information within two days of posting, whereas in the past this might have taken three weeks. This helps to resolve issues, including types of refund fraud, more quickly. These are just a few examples of taxpayer benefits that have rolled out over the past year.
Castro: What opportunities do you see for more use of data analytics at the IRS in the future?
Milholland: The reason to conduct analytics is to find nuggets of information that you can act on. We don’t collect terabytes and petabytes of information to be locked up and only rarely accessed by the most sophisticated programs. In the past, the questions you could ask about the data were limited by the fields into which the data was entered in the first place. In contrast, there are new approaches that are more forgiving in how they combine different types of data. As a result, IRS analysts can ask questions that might not have even occurred to them when we were first setting up the database. We are just tapping into the opportunity to merge more of our structured data (think of the tax return you file every year) with our less structured data (think of call logs) to create a more complete picture of the trends we seek. As the cost of storing data continues to drop, the notion of how much data we should keep online is also changing. This, coupled with the new approaches I just mentioned, opens up creative ways of analyzing the data. I also think our analytics will move more toward rapid visual representation of information as a way to learn from and explore the data. We also have an opportunity to extend analytics by sharing results across internal stakeholders to take advantage of the wisdom of the crowds.
Castro: What kinds of challenges do you see on the horizon for more effective use of Big Data in federal agencies?
Milholland: I’ve always thought the term “Big Data” would be more appropriately labeled as “Big Insights.” That really is the goal: to gain insights. The challenge the federal government has with data is the same whether it is big data, small data, old data or new data. Data by itself is not very interesting. It is combining data with other data that puts it in context and makes for much more interesting use. The integration of data remains a key challenge. Even with standards in place to exchange the data, the data will come in different formats, granularity, frequency, keys, latency, etc., requiring each circumstance to be analyzed. Connecting all the dots between agencies or from any external party is still more art than science.
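As a purely illustrative sketch of the integration problem described above (this is not IRS code, and every field name, identifier and value is invented for the example), the Python below reconciles two hypothetical sources that disagree on key format, date format and granularity before joining them.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical source A: one record per day, IDs with dashes, ISO dates.
daily_payments = [
    {"account_id": "12-345", "date": "2013-01-03", "amount": 100.0},
    {"account_id": "12-345", "date": "2013-01-17", "amount": 50.0},
    {"account_id": "67-890", "date": "2013-02-01", "amount": 75.0},
]

# Hypothetical source B: one record per month, IDs without dashes, MM/YYYY dates.
monthly_balances = [
    {"acct": "12345", "month": "01/2013", "balance_due": 200.0},
    {"acct": "67890", "month": "02/2013", "balance_due": 75.0},
]


def normalize_key(raw_id):
    """Reduce both ID styles to one canonical form (digits only, no padding)."""
    return raw_id.replace("-", "").strip()


def month_of(date_text, fmt):
    """Parse a date in the source's own format and return a 'YYYY-MM' label."""
    return datetime.strptime(date_text, fmt).strftime("%Y-%m")


def integrate(daily, monthly):
    """Roll daily rows up to monthly granularity, then join on (account, month)."""
    paid = defaultdict(float)
    for row in daily:
        key = (normalize_key(row["account_id"]), month_of(row["date"], "%Y-%m-%d"))
        paid[key] += row["amount"]

    combined = []
    for row in monthly:
        key = (normalize_key(row["acct"]), month_of(row["month"], "%m/%Y"))
        combined.append({
            "account": key[0],
            "month": key[1],
            "balance_due": row["balance_due"],
            "paid": paid.get(key, 0.0),  # no matching payments means 0.0
        })
    return combined


result = integrate(daily_payments, monthly_balances)
```

Even in this toy case, each mismatch (keys, date formats, granularity) needs its own explicit rule, which is the per-circumstance analysis the answer refers to.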
Castro: During your tenure at the IRS, you have presided over some major IT modernization projects. What lessons would you offer to others in the federal sector facing similar challenges?
Milholland: First, start with the end in mind but tackle it in small chunks. At the IRS, data is our crown jewel. We don’t make the tax law, and any internal application we create is worthless without the data. So understanding that data drives our success was important to establish up front. We have spent a considerable amount of time educating and training our staff on what it means to be data-centric. So whatever your end state is, be consistent in your messaging so all efforts will support that key driver. That will help ensure everyone is thinking about the end state, which ultimately helps avoid rework.
Second, embrace the iterative process. Modernization is often not a technology challenge. More often it is a cultural shift that takes momentum, which starts with a small success that shows people it can be done a different way. Different doesn’t mean the other way was right or wrong, just different: different so that, whatever your challenge is, you are better positioned to be successful in the future.
The third lesson is to be transparent. The federal government has many oversight communities, and it is easier to bring them along with you than to try to explain things after the fact. Lastly, don’t be afraid to jump generations of technologies, but have tolerance for some failures. We all learn from our failures, and the idea is to learn what not to do as early as possible, before you have invested much money. Prototyping is a good way to test ideas and strategies that seem good on paper but need a practical application before they can be determined to be a good fit for your organization.
“5 Q’s on Data Innovation” is part of an ongoing series of interviews for Data Innovation Day by ITIF Senior Analyst Daniel Castro. If you have a suggestion for someone who should be featured, send an email to Daniel Castro at firstname.lastname@example.org.