BASIQ 2025

Leveraging Web-scraping for Tourism Data Analysis: A Case Study on Romania

Authors: Cristina Rodica Boboc, Ana Maria Babaligea, Simona Ioana Ghiță, Andreea Simona Săseanu

Conference: BASIQ International Conference on New Trends in Sustainable Business and Consumption
Year: 2025
Section: Digital transformation and emerging new technologies challenges
Paper code: 25020
DOI: 10.24818/BASIQ/2025/11/020
PDF: https://conference.ase.ro/papers/2025/25020.pdf

Abstract

In the digitalization era, the availability and accessibility of data have increased significantly, opening up new opportunities for analyzing the tourism sector, an area where official statistics have traditionally been the primary source of data for performance evaluation. This paper investigates the potential of an alternative data source, namely web scraping, emphasizing the additional insights and advantages it can offer compared with conventional statistical data. The data were collected by web scraping an online tourism booking platform, using a purpose-built program written in Python. They are used to conduct an in-depth analysis of the types and quality of Romania's tourism supply, both at the national level and from a territorial perspective. The analysis focuses on the number and types of accommodation establishments, the average price per room for a one-night stay during the peak summer season, and the reviews that tourists gave the accommodation units. Based on the results, a set of recommendations is formulated to support the enhancement of Romania's tourism supply: authorities in less developed tourist areas can boost investment through incentives and promote diverse, authentic accommodations and tourism types, while improving visibility and modernizing existing facilities.
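
The abstract does not describe the scraping program itself, so the following is only a minimal sketch of the kind of pipeline it outlines: extract listing records from page markup, then aggregate counts, average prices and average review scores by territory and by accommodation type. The HTML structure, the attribute names (`data-type`, `data-county`, `data-price`, `data-score`), the helper names and all sample values are hypothetical and are not taken from the paper or from any real booking platform.

```python
from collections import defaultdict
from html.parser import HTMLParser
from statistics import mean

# Hypothetical markup with made-up values; real platforms use different,
# frequently changing page structures. Prices are in an unspecified currency.
SAMPLE_HTML = """
<div class="listing" data-type="hotel" data-county="Constanta" data-price="420" data-score="8.4"></div>
<div class="listing" data-type="guesthouse" data-county="Brasov" data-price="250" data-score="9.1"></div>
<div class="listing" data-type="hotel" data-county="Brasov" data-price="380" data-score="8.7"></div>
"""


class ListingParser(HTMLParser):
    """Collects one record per <div class="listing"> element."""

    def __init__(self):
        super().__init__()
        self.listings = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "div" and attrs.get("class") == "listing":
            self.listings.append({
                "type": attrs["data-type"],
                "county": attrs["data-county"],
                "price": float(attrs["data-price"]),
                "score": float(attrs["data-score"]),
            })


def parse_listings(html):
    """Return the listing records found in an HTML string."""
    parser = ListingParser()
    parser.feed(html)
    return parser.listings


def summarize(listings, key):
    """Group records by `key` and compute count, mean price and mean score."""
    groups = defaultdict(list)
    for item in listings:
        groups[item[key]].append(item)
    return {
        name: {
            "count": len(items),
            "avg_price": mean(i["price"] for i in items),
            "avg_score": mean(i["score"] for i in items),
        }
        for name, items in groups.items()
    }


listings = parse_listings(SAMPLE_HTML)
by_county = summarize(listings, "county")  # territorial perspective
by_type = summarize(listings, "type")      # accommodation types
```

In practice the HTML would come from pages fetched in line with the platform's terms of use, and the same `summarize` helper could be applied to any grouping field, for example a region or a star rating.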