Data - Image credit: Alpha Stock Images (http://alphastockimages.com/)

Love Data Week - Day 3 - Read about "re-use" of data and about the Open path project

We are now halfway through Love Data Week, and there are more stories to share ...

Research data at DTU:


“What motivates me to be a FAIR ambassador is a possibility to disseminate principles that will enhance the transparency and traceability of the work in the data lifecycle and boost the importance of the role of data creators in the research environment.”

Nikola Vasiljevic, researcher at DTU Wind Energy and FAIR ambassador

Stories about data:

 
Re-use of "open data" in "Engineering Design"

An increased focus on integrity and transparency in research is making the concept of 'open data' more and more relevant. In certain research areas, for example the engineering disciplines, making data available or re-using data is not as common as it is in, for example, bioinformatics, physics and computer science.

We talked to Pedro Parraguez, postdoc at DTU Management Engineering, who has studied the use and implications of re-using data that is already available within Engineering Design. He recently published the article “Data-driven engineering design research: Opportunities using open data” (ISSN: 22204334), and we asked him why open data is not part of common practice in this research area.

What is Engineering Design and what type of data do you use in this research discipline?
Engineering Design is a discipline that studies how we design services, products and, more generally, systems of an engineering nature. It can range from the design of a very small component of a machine all the way up to a large system like a bridge or a space shuttle.

The type of data that we use is as diverse as the research area. It can be technical data about measurements or data about material resistance. We also use data about the people who are involved in the process of designing an artifact.

From the title of your publication, one can infer that ‘open data’ and the re-use of open data are not common in Engineering Design. If that is correct, why?
When the discipline started, most data was not available by default. In the 70s and 80s there was no widely available digital trace of what was going on during the design process, so you had to gather the data yourself for each new study. Getting the data was a kind of handcrafted process.

Nowadays, the main constraint is that while the data might already exist, it is usually not collected for the purpose of research, and it is the property of the company that generated it. As a result, data tends to be proprietary and the researcher is allowed to collect it only after having obtained the permission of the company. This means that there are many constraints related to privacy and confidentiality that you need to navigate and respect. Many times, we are simply not allowed to make the data publicly available, or if we are, we normally need to anonymize it heavily.

However, it is becoming increasingly possible to creatively exploit new open data generated in other contexts that is also relevant for Engineering Design research. For example, developing open source software is an engineering design activity because there is a design process in creating this software and in many cases, you can collect digital traces of the GitHub repositories that are publicly available. Also, there are communities of people that design objects through 3D printing and sometimes the data is available online in various degrees of quality, but at least you might be able to get the data. You have patents and a number of other sources that you can also creatively combine and use in Engineering Design.

What would be the benefit of using open data in your research area?
All disciplines are under a lot of pressure to be as transparent as possible and to enable everybody to check if what you are reporting is correct. Issues related to replicability, validity and reliability are very important in any discipline. In disciplines like Engineering Design, there is the extra challenge of privacy and confidentiality that I mentioned before, so achieving transparency is usually not an easy task. That is why using open data creatively can allow us to move the discipline towards easier ways of checking validity and reliability. It does not mean that we will ever get to be 100% open, but at least we can move in that direction.

Pedro Parraguez Ruiz, Postdoc,
Engineering Systems division, DTU Management Engineering,
ppru@dtu.dk, ORCID: 0000-0002-0017-4057

We are data:

 
‘The Open path project’

It is old news that enormous amounts of private data are collected and stored by various companies. Geographic data from our mobile phones is just one example - but why not organize, visualize and use this data yourself?

If you are looking for datasets to explore data science and try out the available tools, check out this interesting project from New York Times Labs (@NYTLabs) called openpaths.cc

“Using our mobile apps you can track your location, visualize where you've been, and upload your data to the OpenPaths website. You can then download your data from the website in a variety of friendly formats, including KML, JSON, and CSV.” (Source: OpenPaths website)
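For readers who want to try this themselves, here is a minimal Python sketch of what "using your own data" could look like once you have a CSV download. The column names (lat, lon, date) and the sample rows are assumptions made up for illustration; the actual export schema may differ, so adjust the names to match your file.

```python
import csv
import io
import math

# Hypothetical sample shaped like a location export. The column names
# (lat, lon, date) are assumptions, not a documented OpenPaths schema.
SAMPLE_CSV = """lat,lon,date
55.7856,12.5214,2018-02-12 08:00:00
55.6761,12.5683,2018-02-12 09:00:00
55.6761,12.5683,2018-02-12 17:00:00
"""


def load_points(text):
    """Parse CSV text into a list of (lat, lon) tuples in file order."""
    reader = csv.DictReader(io.StringIO(text))
    return [(float(row["lat"]), float(row["lon"])) for row in reader]


def haversine_km(a, b):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def total_distance_km(points):
    """Sum the distances between consecutive points."""
    return sum(haversine_km(p, q) for p, q in zip(points, points[1:]))


points = load_points(SAMPLE_CSV)
print(f"{len(points)} points, {total_distance_km(points):.1f} km travelled")
```

To run it on a real download, replace SAMPLE_CSV with the contents of your own file, for example by reading it with open(...).read().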

 
