All posts by Travis Korte

Why Are There No Jobs for Hadoop in the Federal Government?

Hadoop has been the industry standard for scalable data processing applications for several years, so why does a search for “Hadoop” on the federal government’s official jobs site return zero results?

One reason could be that, given the current budget environment, hiring for IT projects has been suspended. The budget is certainly a factor, but it cannot be the only one: jobs for SQL, Java, and even COBOL developers can still be found.

Another reason might be that the federal government is simply contracting out this work. Again, this might explain part of the situation, but if so, it reflects poor planning. Given the massive amount of information the federal government collects, stores, and processes, these skills will only become more critical, and agencies should be cultivating this talent in-house.

A more likely reason is that government agencies have not fully embraced “big data” because government leaders still do not fully understand what it can do or how it can help them operate more efficiently. For example, text mining can be applied to financial fraud detection, research paper classification, student sentiment analysis, and smarter search engines for all

Read the rest
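The text mining applications mentioned above need not be exotic. As a purely illustrative sketch (the categories and training snippets below are invented, not drawn from any government dataset), here is a minimal Naive Bayes text classifier of the kind that could flag suspicious financial records:

```python
from collections import Counter, defaultdict
import math

def tokenize(text):
    return text.lower().split()

class NaiveBayesClassifier:
    """Minimal multinomial Naive Bayes for short documents."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word frequencies
        self.doc_counts = Counter()              # label -> number of documents
        self.vocab = set()

    def train(self, labeled_docs):
        for text, label in labeled_docs:
            words = tokenize(text)
            self.word_counts[label].update(words)
            self.doc_counts[label] += 1
            self.vocab.update(words)

    def classify(self, text):
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, float("-inf")
        for label, n_docs in self.doc_counts.items():
            # log prior plus log likelihood with add-one smoothing
            score = math.log(n_docs / total_docs)
            total_words = sum(self.word_counts[label].values())
            for word in tokenize(text):
                count = self.word_counts[label][word] + 1
                score += math.log(count / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label

# Invented training snippets for illustration only.
training = [
    ("unusual wire transfer flagged account", "fraud"),
    ("suspicious transaction duplicate invoice", "fraud"),
    ("quarterly revenue report filed on time", "routine"),
    ("payroll processed normally this quarter", "routine"),
]
clf = NaiveBayesClassifier()
clf.train(training)
print(clf.classify("duplicate wire transfer flagged"))  # "fraud"
```

Production systems use far richer features and models, but the basic idea, counting words and comparing probabilities across categories, is the same.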

Using Data To Fight Counterfeit Goods

Whether advertised on seedy websites or peddled on Manhattan’s Canal Street, counterfeit goods remain a serious problem for U.S. businesses and consumers. Despite the efforts of companies and government agencies alike, the International Chamber of Commerce estimates that the total value of counterfeit and pirated products produced worldwide could reach $1.5 trillion, or around 2 percent of the global economy, by 2015. Counterfeit goods, such as fake pharmaceuticals, tainted baby formula and substandard tires, present numerous safety and reliability concerns for consumers and organizations around the world and drive up prices for consumers purchasing legitimate goods.

Until now, illicit counterfeiting operations have had the advantage, as they have been able to exploit the Internet and other technologies to market and distribute their goods more efficiently, especially goods produced in developing countries. In addition, the manual monitoring practices that companies have relied on to identify counterfeits are cost-prohibitive at Internet scale. However, new opportunities to leverage data and data analytics may shift the balance of power back to legal businesses and law enforcement officials by allowing them to detect, track, prioritize, investigate and report potential counterfeit goods more efficiently than

Read the rest

What Does $3.2M Buy in Open Government?

Last week, the Knight Foundation awarded over $3.2 million to the eight winners of its “Knight News Challenge on Open Gov,” a competition open to non-profits, for-profits and individuals around the world that was designed to “provide new tools and approaches to improve the way people and governments interact.” Below is an overview of the winners and the problems they sought to solve.

Pictured: a still from winner GitMachines’ application video.

The Problems

One of the benefits of a public challenge is the chance to identify problems (or opportunities) in government that might be addressed with existing technology. The winning entries noted the following problems:

  • Government procurement is confusing and uncompetitive. The complexity of government procurement policies and practices stifles competition, especially among small businesses, leading to wasted tax dollars. On top of this, many government procurement websites are difficult to use.
  • Proposed policies suffer from poor public understanding. Much of politics is about pocketbook issues, but with complex legislation it is difficult for voters to know the personal impact of different policy proposals.
  • Court records are not digitally accessible. Federal appellate court and state supreme court

Read the rest

G8 Charter Puts Open Data on International Agenda

Last month’s international G8 summit produced a declaration with new guidelines for a broad range of policy issues. Included in this declaration was a set of recommendations for open data initiatives, known as the Open Data Charter. The charter represents the first time open data principles have been agreed to in an international forum—and possibly the highest-level declaration of any kind to reference the open source code repository GitHub—and will likely help shape the future role of government in data. Here are the key facts.

The summit

The Group of Eight is a policy forum for the governments of eight of the world’s largest economies, held annually since 1975 (it began with six member states and later grew to seven). Although the summit will likely be gradually supplanted by the larger G20, which includes developing economies and non-Western states, the G8 remains a bellwether of international policy. This year’s event was held June 17-18 at the Lough Erne Resort in County Fermanagh, Northern Ireland, and focused on tax policy as well as the ongoing Syrian civil war.

The participants

U.K. Prime Minister David Cameron played host to President Barack Obama, German

Read the rest

Data Science Is Not PRISM: In Defense of Analytics

In the wake of the leaks that revealed the National Security Agency’s (NSA’s) PRISM surveillance program, several recent articles have responded with criticism of “big data.” “The advantages of big data could prove to be ephemeral,” author Andre Mouton writes in USA Today, but “the costs…will probably be sticking around.” And Andrew Leonard at Salon directly blames the technology, writing, “By making it economically feasible to extract meaning from the massive streams of data that increasingly define our online existence, [distributed processing platform] Hadoop effectively enabled the surveillance state.”

Pictured: Michael Flowers, civic data icon and Analytics Director of the City of New York’s Office of Policy and Strategic Planning. Photo: DataGotham

But criticizing “big data” itself is a curious thing. In its original form, “big data” was just a catchall term for the technologies—borrowed mostly from statistics and computer science—that could still handle data analysis problems too large for a typical single processor. The connotation of “big” as in “big tobacco” was added retroactively. Many practitioners prefer the broader term “data science” for this very reason: they aren’t members of some kind of shadowy syndicate. They aren’t even in

Read the rest
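The problems these technologies address, datasets too large for one machine, are typically attacked with the MapReduce model that Hadoop popularized: map each record to key-value pairs, group by key, then reduce each group. A toy single-process word count in Python (no Hadoop required) shows the shape of the idea:

```python
from collections import defaultdict
from itertools import chain

def map_phase(document):
    """Emit a (word, 1) pair for every word in a document."""
    return [(word.lower(), 1) for word in document.split()]

def shuffle_phase(pairs):
    """Group values by key, as Hadoop does between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(grouped):
    """Sum the counts for each word."""
    return {word: sum(values) for word, values in grouped.items()}

documents = ["big data is not big tobacco",
             "data science is data analysis"]
pairs = chain.from_iterable(map_phase(doc) for doc in documents)
counts = reduce_phase(shuffle_phase(pairs))
print(counts["data"])  # 3
print(counts["big"])   # 2
```

In a real Hadoop cluster the map and reduce functions run in parallel across many machines and the shuffle happens over the network, but the programming model is exactly this simple, which is part of why it spread so quickly.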

Patent and Trademark Data: What’s New and What’s Next

The first U.S. patent (above) was granted on July 31, 1790. It was issued to one Samuel Hopkins, for a process to make potash (a chemical used in fertilizer), and it was signed by George Washington himself. The original piece of paper still exists, and its information is logged in the databases of the U.S. Patent and Trademark Office (USPTO).

Since that day in 1790, the USPTO and its antecedents have been diligently collecting data on all of this country’s patent activity. It is a venerable information processing organization, and its objectives of making prior art accessible and encouraging innovation by simplifying the patent-granting process have not changed much over its history. The means it uses to achieve these objectives, on the other hand, have changed dramatically, and although it has made great strides in digitization and electronic filing, the USPTO and its international counterparts stand to benefit greatly from advanced data science initiatives.

The Present

The USPTO houses a wealth of valuable data in its patent library that is critical for businesses, researchers, and individual inventors. This information used to be locked up in specially-designated Patent and Trademark

Read the rest

Cicada Tracker and the Future of Citizen Sensing

The massive cicada emergence that spread across the eastern seaboard this spring is winding down, but its end heralds another gradually emerging phenomenon: citizen sensing. The Cicada Tracker—a community data-gathering initiative for documenting the noisy insects’ emergence from their burrows—was a rousing success, and it should encourage data innovators across the country to think about what a few motivated citizens and some commodity hardware can do for their communities.

The tracker was devised at a hackathon by John Keefe, a data journalist working for New York public radio station WNYC. The device was a simple piece of open hardware, consisting of an Arduino microcontroller, a temperature sensor, LEDs, resistors and wiring.

For around $80 and some careful construction, it enabled ordinary folks to measure soil temperature, which is a reliable indicator for exactly when the cicadas will surface. After measuring the temperature, people could then send that data—along with their locations and eventually any cicada sightings—to the WNYC team, who created an interactive map to visualize the emerging swarm.
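The commonly cited rule of thumb is that periodical cicadas emerge once the soil roughly eight inches down holds around 64°F. The core logic of such a device is simple enough to sketch in a few lines of Python (the threshold and the Celsius sensor reading here are assumptions for illustration, not WNYC’s exact code):

```python
def fahrenheit(celsius):
    """Convert a Celsius sensor reading to Fahrenheit."""
    return celsius * 9 / 5 + 32

def cicadas_likely(soil_temp_f, threshold_f=64.0):
    """Flag likely emergence. The 64 F threshold is the commonly
    cited figure for periodical cicadas, assumed here for illustration."""
    return soil_temp_f >= threshold_f

reading_c = 18.5  # hypothetical soil probe reading in Celsius
print(cicadas_likely(fahrenheit(reading_c)))  # 18.5 C is 65.3 F -> True
```

On the actual device this comparison would run on the Arduino, lighting an LED when the threshold is crossed; the interesting part of the project was never the logic but the crowd of people willing to stick probes in the ground and report back.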

Harvard’s Nieman Journalism Lab reports that the Cicada Tracker organizers received nearly 1,500 temperature readings and an additional 2,000 sightings,

Read the rest

FedTalks 2013: Highlights and Observations

The FedTalks 2013 conference, held June 12 in Washington, brought together a motley crew of government officials, tech company executives, military contractors and civic IT experts to discuss “how technology and people can change government and our communities.” The speakers, ranging from Senator Mark Warner (D-VA) to famed impostor Frank Abagnale (more on him below) came from similarly broad backgrounds. Here is a quick rundown on some highlights and observations from the conference:

Innovators Listen

A federally-supported platform for civic innovation competitions came up several times, including in U.S. CIO Steve VanRoekel’s keynote address on increasing government efficiency. The site—itself a public-private partnership with technology competition company ChallengePost—encapsulates a theme that pervaded FedTalks 2013 and that’s particularly relevant in the data science sector: as long as government agencies lack the expertise to design and implement data collection mechanisms and disciplined analytics themselves, they will need to get help from external sources. Acting GSA Administrator Dan Tangherlini made the excellent point that in addition to the value created by the winning entries on the platform and similar sites, other contestants often generate economic value that dwarfs the prize money being

Read the rest

Highlights From The National Day of Civic Hacking

The atmosphere after a hackathon is usually one of relief and mutual congratulation—“We finally made it,” the participants say, referring both to finishing their programs and reaching the end of the grueling event—but the real work takes place in the weeks and months that follow. That’s when the programmers, designers, and subject matter experts refine their work, hopefully planting the seed for a new business or public service.

Below are four standout projects that emerged from the National Day of Civic Hacking (NDoCH), which took place over the first two days of June in 95 locations around the United States. Their ingenuity is worth celebrating, and there are also lessons to be learned from each of them.

Spreading success stories in Chicago

In Chicago, an app called TowText lets users know if their car has been towed, and provides the phone number and address of the impound lot. The best part? Because the City of Chicago has a standardized data-collection policy and a rapidly updating database for relocated vehicles, TowText users get a message within fifteen minutes of their car being logged.
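The pattern behind an app like this is straightforward: poll the city’s towed-vehicle feed, match registered plates against new records, and notify the owner. A minimal sketch in Python (the record fields and sample data are invented for illustration; TowText’s actual feed and matching logic may differ):

```python
from dataclasses import dataclass

@dataclass
class TowRecord:
    plate: str
    pound_address: str
    pound_phone: str

# Invented sample records standing in for the city's towed-vehicle feed.
FEED = [
    TowRecord("ABC1234", "701 N. Sacramento Blvd.", "555-0142"),
    TowRecord("XYZ9876", "10301 S. Doty Ave.", "555-0178"),
]

def check_plate(plate, feed):
    """Return a notification message if the plate appears in the feed,
    or None if the car has not been logged as towed."""
    for record in feed:
        if record.plate == plate.upper():
            return (f"Your car was towed to {record.pound_address}. "
                    f"Call {record.pound_phone}.")
    return None

print(check_plate("abc1234", FEED))
```

In production the feed would be fetched over HTTP on a schedule and the message delivered by SMS; the fifteen-minute turnaround the article describes comes from the city updating its database quickly, not from anything clever in the app.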

TowText was created by designer-engineer Tony Webster. Webster notes

Read the rest

Your Friendly Neighborhood Hacker

When local news editors across America received tips that hackers would be gathering in their town over the weekend, they must have been alarmed. The events of the first National Day of Civic Hacking (NDCH) – held June 1-2 in 95 locations around the country – were benign, as anyone who has ever attended a similar meet-up might imagine, but that didn’t stop the flood of references to malware, identity theft and other computer security breaches in the news coverage.

In reality, the mission of the NDCH couldn’t have been more “white hat”:

“The event will bring together citizens, software developers, and entrepreneurs from all over the nation to collaboratively create, build, and invent new solutions using publicly-released data, code and technology to solve challenges relevant to our neighborhoods, our cities, our states and our country.”

This wasn’t the sort of “hacking” that captured the popular imagination in the ‘80s and ‘90s; the NDCH events looked more like community service jamborees, with visits from small-town mayors and a few boxes of free pizza on the tables. The participants weren’t there to break laws, and in fact collaborated with local

Read the rest