big data and operations research

Sheldon Jacobson and Edwin Romeijn, the OR and SES/MES program directors at NSF, respectively, talked about the role of operations research in the bigger picture of scientific research at the INFORMS Computing Society Conference in Santa Fe last week. Quite often, program managers at funding agencies dole out advice on how to get funded. This is useful, but it doesn’t answer the more fundamental question of why they can fund only so many projects.

Sheldon and Edwin answered this question by noting that OR competes for scientific research dollars with every other scientific discipline. One way both to improve our funding rates and to give back to our field is to make the case that operations research should get a bigger slice of the research funding pie.

Sheldon specifically mentioned OR’s role in “big data.” Most of us work or do research where data plays an integral role, so this seems like a great opportunity for our field. I’ve been thinking about the difference between “data” and “big data” in terms of operations research. Big data was a popular term in 2012, even though there is no good definition of how “big” or diverse the data must be before it becomes “big data.” NSF had a call for proposals for core techniques for big data. The call summarized how they define big data:

The phrase “big data” in this solicitation refers to large, diverse, complex, longitudinal, and/or distributed data sets generated from instruments, sensors, Internet transactions, email, video, click streams, and/or all other digital sources available today and in the future.

I like this definition of big data, since it acknowledges that the challenges do not only lie in the size of the data; complex data in multiple formats and data that changes rapidly is also included.

I ultimately decided not to write a proposal for this solicitation, but I did earmark it as something to think about for the future. This call required the innovation to be on the big data side, meaning that projects that merely use big data in new applications would not be funded. Certainly, OR benefits from a data-rich environment, since it leads to new OR models and methods. Here, data is mainly used as a starting point from which to explore new areas. But this means that there is no innovation on the big data side; instead, the innovation is on the OR side. Does big data in OR mean that we will continue to do what we have been doing well, just with bigger data?

This is an open question for our field: how will big data fundamentally change what we do in operations research?

My previous post on whether analytics is necessarily data-driven and whether analytics includes optimization can be viewed as a step towards an answer, but I’m not close to having one. Please let me know what you think.


7 responses to “big data and operations research”

  • Mathematical Seer (@MathSeer)

    You can indeed bid that big data will remove the lousy math from OR. Analytics which include optimization are, to my taste, employing the wrong math tools. I believe that successful predictive analytics will more and more employ data-driven methods based on extracting the proper information.

  • Alex Mills

    Big data is when you need a NoSQL database management system to manage the data because a traditional RDBMS has become unwieldy.

    I disagree with “I like this definition of big data, since it acknowledges that the challenges do not only lie in the size of the data; complex data in multiple formats and data that changes rapidly is also included.” Big data is by definition large quantities of data at once; if data streams in and can then be discarded, that doesn’t count.

  • Laura McLay

    Alex, I think in all cases, the data is “big” – but big data that is complex and has a high velocity is in many ways more challenging to deal with than “bigger” data that is not changing and in a single format. But it’s hard to be precise here. I didn’t state that clearly.

    But I like your definition, too. When you cannot open data using regular data processing software, you know it’s big (:

  • j_rosenbe

    I have talked to Edwin at length, and also to Sheldon and Dane Skow to some degree, about this. The original term “big data” was about analytics, such as Moneyball, Nate Silver, etc. This is still the way the private sector uses the term too. Check out just about any post from ficolabsblog.fico.com for example. (Full disclosure: My dad works at FICO.) However, the CS database folks were very clever and more or less redefined big data at NSF to refer to storage, not analytics. In my mind, this was a very clever coup, because as a buzz term, “big data” is hot.

    I can say personally I have shifted my research topics to include more statistical/predictive modeling as a component (though generally not the primary one) of my research. So for example, suppose someone had a set of statistical models to describe a system and now wants to optimize it. Most statisticians will shove these models into a black box heuristic such as a genetic algorithm. However, mathematical programmers have known for years that there are better ways to optimize by exploiting special structures of the models. Developing these better methods has become the focus of most of my research topics these days.

  • Ahmet

    After analyzing big data, we can extract new properties of the system and define new objectives.

  • F Marco-Serrano

    I’d note that definition too: if you can’t process it with conventional software, then it’s big. However, that would create false positives, since a big dataset isn’t necessarily big data.

  • The dice is rolling

    […] January I read from Laura’s blog (Punk Rock OR) she had started to have a look at it. In March I was helping a colleague to proceed with some […]
