
Keynote speech of Pierluigi Coppola at Data Mobility 2025

Transforming big data into mobility services.

Examples, models and practical applications

Big data has now taken over much of the debate on innovative methods of data collection and analysis in the field of mobility, but what are the concrete applications that allow it to be used for new mobility policies and transport services? In his keynote speech at the Data Mobility Summit 2025, Pierluigi Coppola, full professor of Transport Planning at the Politecnico di Milano, took us on a journey through the new frontiers of demand modeling and data analysis, showing how the integration of technology, machine learning, and knowledge of social phenomena can transform raw data into innovative sustainable mobility practices. From Smart Apps to activity-based models and nudging for behavioral change, his presentation invites us to rethink strategies for collecting, validating, and using data, and to invest in tools and skills capable of grasping the complexity of reality.

Navigating the world of big data

Big data now affects various sectors of urban and extra-urban mobility, complementing and in some cases replacing traditional techniques for detecting and observing mobility phenomena. Sources range from smart cards and video or thermal cameras that monitor people flows in public spaces to more sophisticated techniques, such as the collection of telephone data and communication between vehicles and smartphones. Together, they enable the development of advanced models, innovative policies, and new mobility services.

But how can we classify the vast universe of big data? A first distinction is between aggregated data and disaggregated data.

Aggregated big data mainly concerns traffic flows, detected using technologies that have now replaced traditional roadside traffic counts: smart cards, FCD (floating car data from monitoring devices installed in cars for insurance purposes), and telephone data. “For privacy reasons, this data can only be used in aggregate form: when the number of observations sharing the same origin and destination falls below a minimum threshold, masking is used to obscure the data so that it cannot be traced back to a single individual.” This data allows for advanced static and dynamic analysis, as well as clustering analysis, vehicle flow and trajectory analysis, which are very useful for understanding mobility trends over time (e.g., in a day) and updating O/D matrices.
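The masking rule described above can be sketched in a few lines of Python. This is only an illustration of the general idea (suppressing small origin-destination cells), not the procedure actually used by data providers: the threshold of 5 observations, the zone names, and the function name `aggregate_and_mask` are all assumptions made for the example.

```python
from collections import Counter

# Hypothetical threshold: the article does not state the actual minimum.
MIN_OBSERVATIONS = 5

def aggregate_and_mask(trips, min_obs=MIN_OBSERVATIONS):
    """Count trips per (origin, destination) pair and mask small cells.

    Cells with fewer than `min_obs` observations are returned as None,
    so they cannot be traced back to a single individual.
    """
    counts = Counter((origin, dest) for origin, dest in trips)
    return {pair: (n if n >= min_obs else None) for pair, n in counts.items()}

# Illustrative trips between three made-up zones.
trips = [("A", "B")] * 12 + [("A", "C")] * 3 + [("B", "C")] * 7
od_matrix = aggregate_and_mask(trips)

for pair, n in sorted(od_matrix.items()):
    print(pair, "masked" if n is None else n)
```

In this sketch the A-C cell, with only three observations, is masked, while the other two cells keep their counts.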

Disaggregated data, on the other hand, allows for the study of mobility behaviors: “Traditional surveys based on user questionnaires are now being replaced by smart apps and travel diaries, which can automatically and directly monitor people’s behavior.” These tools capture not only the origin and destination of trips, but also the duration of stops, the mode of transport used, and the activity at the destination, thanks to sophisticated algorithms for the automatic recognition of these aspects.

Smart applications are in turn divided into supervised and unsupervised, depending on the type of interaction with the smartphone owner (whether direct or absent).

The supervised application, i.e., one with direct interaction, has several advantages but also costs, because it requires validation and action by the user. Today, however, machine learning and clustering algorithms allow unsupervised applications to obtain individual information without any interaction with the user. “For example, hierarchical clustering and DBSCAN techniques can be used to derive the reason for travel from the characteristics of the destination locations in the study area, the frequency of travel, and the duration of the stop. Other applications are able to automatically recognize the mode of travel based, for example, on the speed of movement.” In all these cases, we are talking about “enhanced unsupervised learning,” i.e., machine learning techniques applied to recurring data, even anonymous data, from which information about the individual and how they are moving can be gleaned.
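To make the clustering step more concrete, here is a minimal pure-Python sketch of DBSCAN applied to stop locations. It is not the algorithm used in the apps mentioned in the talk: the coordinates (planar, in meters), the `eps` radius of 50 m, and the `min_pts` value of 3 are illustrative assumptions. Recurring stops at the same place end up in one cluster, which could then be labeled (for example as home or work) using visit frequency and stop duration.

```python
import math

def dbscan(points, eps, min_pts):
    """Return one label per point: 0, 1, ... for clusters, -1 for noise."""
    labels = [None] * len(points)

    def neighbours(i):
        # All points within eps of point i (including i itself).
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = -1  # provisional noise; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(seeds)
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reachable from a core point: border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reach = neighbours(j)
            if len(reach) >= min_pts:
                queue.extend(reach)  # j is a core point, so keep expanding
    return labels

# Hypothetical stop locations in planar coordinates (meters).
stops = [(0, 0), (10, 5), (5, 12), (8, 3),           # repeated stops at one place
         (1000, 1000), (1010, 995), (1005, 1008),    # repeated stops at another
         (5000, 200)]                                # an isolated, one-off stop
labels = dbscan(stops, eps=50, min_pts=3)
print(labels)
```

A density-based method is a natural fit here because the number of meaningful places per person is not known in advance and one-off stops are simply left as noise.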

Let’s get into the details: Supervised Learning

According to Prof. Coppola, the frontier is represented by supervised learning, i.e., applications that allow for validation feedback from the user on the characteristics of the trip estimated through algorithms, such as the mode of transport, the reason for the trip, or the activities carried out at the destination; this allows the mobility data to be associated with a user profile.

This type of approach encounters two types of barriers: on the one hand, people’s resistance to giving their consent to the tracking of their movements for prolonged periods of time; on the other, the need for the app to be constantly active on the phone and connected to the network.

But what is the potential of this method? First and foremost, supervised learning through direct interaction with users allows information on daily mobility (“travel diaries”) to be collected, enabling the development of activity-based models, “a modeling approach that originated in the 1980s for research purposes but was never actually applied due to the difficulty of obtaining sufficient data. Today, these models are becoming relevant again thanks to the availability of travel diaries at relatively low cost.”

Activity-based models are more advanced demand models than traditional four-stage models and allow for the simulation of daily mobility of individuals by evaluating the entire sequence of activities performed by the user during the day and not just referring to a single origin-destination trip. How do they do this? By acquiring information on the location, duration, and characteristics of trips between different daily activities. These models therefore make it possible to reconstruct daily trip chains and the mobility needs that generate them, simulating the transport demand that derives from the various activities planned and carried out by individuals.
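As a rough illustration of the difference from a single origin-destination trip, the sketch below derives a daily trip chain from one person's travel diary. The `Activity` record, the zone identifiers, and the times are hypothetical; real activity-based models involve far richer data and behavioral modeling than this.

```python
from dataclasses import dataclass

@dataclass
class Activity:
    purpose: str     # e.g. "home", "work", "shopping"
    location: str    # zone identifier (hypothetical)
    start_min: int   # start time, minutes after midnight
    end_min: int     # end time, minutes after midnight

def trip_chain(activities):
    """Derive the trips that link consecutive activities of one person-day."""
    ordered = sorted(activities, key=lambda a: a.start_min)
    trips = []
    for prev, nxt in zip(ordered, ordered[1:]):
        trips.append({
            "origin": prev.location,
            "destination": nxt.location,
            "purpose": nxt.purpose,            # the activity that generates the trip
            "depart_min": prev.end_min,
            "travel_min": nxt.start_min - prev.end_min,
        })
    return trips

# One illustrative diary: home, work, shopping, then home again.
diary = [
    Activity("home", "Z1", 0, 480),
    Activity("work", "Z7", 510, 1020),
    Activity("shopping", "Z4", 1040, 1080),
    Activity("home", "Z1", 1100, 1440),
]
chain = trip_chain(diary)
pattern = "-".join(a.purpose for a in sorted(diary, key=lambda a: a.start_min))

print(pattern)
for trip in chain:
    print(trip)
```

Here the three trips are not independent observations: each one exists because of the activity that follows it, which is the link an activity-based model tries to capture.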

The activity-based approach: experiments and difficulties

Examples of activity-based models are being developed at the Politecnico di Milano and concern the frequency of daily trips (trip frequency models) or the sequence of primary and secondary activities and the mode of transport used between one activity and another.


Engaging a sufficiently large sample of individuals willing to install a background app that monitors their daily activities can be a critical issue. How can this reluctance be overcome? “In many cases, monetary incentives and rewards are used, but this is not always sufficient. In an experiment conducted with students at the Politecnico di Milano, despite the incentive of a €50 shopping voucher, there was an 80% drop in participation between those who were contacted to participate in the experiment and those who actually installed the app and used it.” Over time, participation also undergoes a “physiological decline” and tends to decrease.

The importance of data processing

Depending on the type of analysis and model to be developed, data can be manipulated, used, interpreted, and processed differently. It is therefore essential to build a database that is accurate and useful for the intended purposes. If the goal is to develop activity-based models, all “inner loop trips,” i.e., short trips within the vicinity of the residence, must be discarded. When studying the mobility behaviors of pedestrians and cyclists, however, this data becomes essential.
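A possible way to separate these trips is sketched below, under the assumption that an “inner loop trip” is one whose origin and destination both lie within a fixed radius of the residence. The 500 m radius, the coordinates, and the function names are illustrative choices, not values taken from the talk.

```python
import math

EARTH_RADIUS_M = 6_371_000

def haversine_m(a, b):
    """Great-circle distance in meters between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(h))

def is_inner_loop(trip, home, radius_m=500):
    """Assumed definition: both trip ends fall within radius_m of the residence."""
    return (haversine_m(trip["origin"], home) <= radius_m
            and haversine_m(trip["destination"], home) <= radius_m)

def split_trips(trips, home, radius_m=500):
    """Return (other_trips, inner_loop_trips) so neither set is lost."""
    inner = [t for t in trips if is_inner_loop(t, home, radius_m)]
    other = [t for t in trips if not is_inner_loop(t, home, radius_m)]
    return other, inner

# Hypothetical residence and trips (coordinates are made up).
home = (45.4780, 9.2270)
trips = [
    {"origin": home, "destination": (45.4795, 9.2285)},   # short walk nearby
    {"origin": home, "destination": (45.4642, 9.1900)},   # trip across the city
]
other, inner = split_trips(trips, home)
print(len(other), "trips kept for the activity-based model,", len(inner), "set aside")
```

Returning both lists rather than deleting the short trips reflects the point made above: the same records that are discarded for an activity-based model are the ones needed for a pedestrian or cycling study.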

Much of the work depends on how the raw data is processed: “An often overlooked aspect that is becoming increasingly relevant, and which depends heavily on the purpose of the analysis.” Data processing involves correcting the data, especially when it is not validated or supervised by the user. In one example, a train journey between Brescia and Milan is interpreted as a motorway journey because it is assigned to the extra-urban road route.

In addition, it is often necessary to reduce the complexity of travel diaries: monitoring too many activities would lead to a number of combinations that would be difficult to manage.

Real-time data & nudging: a winning combination

Another element of big data classification (particularly for data from Smart Apps) concerns availability over time, i.e., whether the data acquired is available only ex post or also in real time. Real-time data can be used today not only to perform real-time analysis and thus implement demand control policies, but also, and above all, to develop and test innovative policies.

Examples include nudging and gamification, “i.e., strategies implemented to encourage users with a ‘gentle nudge’ towards more sustainable mobility behaviors, rewarding them with gadgets, discount vouchers, monetary incentives, or through the stimulus of competition.”

In these cases too, the provision of effective incentives is crucial. An ongoing study at the Politecnico di Milano has shown, on the one hand, that without incentives, participation in the experiment is low, but, on the other hand, that nudging and gamification can be effective in stimulating a shift towards more sustainable mobility behaviors, such as a greater tendency to walk and cycle. “These results are in line with the literature: without monetary or significant incentives, the impact is poor: resources are needed to implement nudging policies.”

Other innovative policies: crowd shipping

Another example of innovative policies based on real-time data is crowd shipping. This concept is gaining ground in northern European countries and is based on entrusting last-mile deliveries of small packages to travelers themselves. Users agree to pick up the package at a locker point and deliver it to the recipient by making small detours from their usual route, in exchange for compensation.

A study by the Politecnico di Milano aims to estimate the willingness of university students to participate in this type of practice, using the lockers available on the university campus. “It is estimated that the ‘willingness to work’ as a crowd shipper is around €10/hour. Participation and availability depend greatly on the reason for the trip and the urgency of reaching the destination.”

What’s next?

In conclusion, supervised Smart Apps represent a new frontier for data collection aimed at mobility analysis and the design of innovative policies. However, it is essential to improve data processing and the transformation of raw data into a database that can be used for analysis and the development of advanced models.

Research in this area is focusing on improving data collection systems and developing algorithms—including artificial intelligence—to better identify movements. However, the methods of engagement and participation to promote sustainable mobility through the use of these applications still need to be improved and explored further. “This is an issue that needs to be addressed with appropriate measures that encourage participation in these trials and their effective use in everyday practice.”


©2025 GO-Mobility s.r.l. | Partita IVA 11257581006