The Quality of Experience (QoE) of multimedia content depends on many factors, including perceptual audiovisual quality, the user's expectations of the system, and novelty effects, which makes it hard to model comprehensively. The additional degrees of freedom that come with immersive media complicate the problem further.
Measuring QoE in immersive scenarios allows us to understand which factors play a role in the human perception of multidimensional content. The goal is to build models that account for the intrinsic distortions of the content under examination, placed in the larger context of its position and function in the virtual space, its importance for the task at hand, and how it interacts with the prior expectations, beliefs, and experiences of the end user.
Lee, S., Viola, I., Rossi, S., Guo, Z., Reimat, I., Lawicka, K., Striner, A., and Cesar, P., 2024. Designing and Evaluating a VR Lobby for a Socially Enriching Remote Opera Watching Experience. IEEE Transactions on Visualization and Computer Graphics.
Gutierrez, J., Perez, P., Orduna, M., Singla, A., Cortes, C., Mazumdar, P., Viola, I., Brunnström, K., Battisti, F., Cieplińska, N. and Juszka, D., 2021. Subjective Evaluation of Visual Quality and Simulator Sickness of Short 360 Videos: ITU-T Rec. P.919. IEEE Transactions on Multimedia, 24, pp.3087-3100.
Subramanyam, S., Li, J., Viola, I. and Cesar, P., 2020, March. Comparing the quality of highly realistic digital humans in 3DoF and 6DoF: A volumetric video case study. In 2020 IEEE Conference on Virtual Reality and 3D User Interfaces (VR) (pp. 127-136). IEEE.
Alexiou, E., Viola, I., Borges, T.M., Fonseca, T.A., de Queiroz, R.L. and Ebrahimi, T., 2019. A comprehensive study of the rate-distortion performance in MPEG point cloud compression. APSIPA Transactions on Signal and Information Processing, 8.
Viola, I., Řeřábek, M. and Ebrahimi, T., 2017. Comparison and evaluation of light field image coding approaches. IEEE Journal of Selected Topics in Signal Processing, 11(7), pp.1092-1106.
Direct measurements of QoE through user studies are commonly regarded as the ground truth for the perceptual merit of distorted content. However, they are cumbersome and expensive to run. Considerable effort in the literature has therefore gone into algorithmic solutions that can mimic and predict users' perception.
The goal is to build QoE models that holistically consider the intrinsic distortions of the content, as well as the rendering parameters, lighting effects, and position in the virtual space. Such models will help optimize acquisition, transmission, and rendering in an end-to-end system, taking all of its components into account and fine-tuning each one to achieve the best possible results.
Zhou, X., Alexiou, E., Viola, I., and Cesar, P., 2023, October. PointPCA+: Extending PointPCA objective quality assessment metric. In 2023 IEEE International Conference on Image Processing (ICIP). IEEE.
Smitskamp, S., Viola, I., and Cesar, P., 2023, June. Evaluation of point cloud features for no-reference visual quality assessment. In 2023 Fifteenth International Conference on Quality of Multimedia Experience (QoMEX) (pp. 1-6). IEEE.
Viola, I. and Cesar, P., 2020. A reduced reference metric for visual quality evaluation of point cloud contents. IEEE Signal Processing Letters.
Viola, I., Subramanyam, S. and Cesar, P., 2020, May. A color-based objective quality metric for point cloud contents. In 2020 Twelfth International Conference on Quality of Multimedia Experience (QoMEX) (pp. 1-6). IEEE.
Immersive media systems have brought about a paradigm shift in how content is consumed: while traditional video is passively consumed as-is, users of immersive content can actively decide where to look and where to direct their attention. Being able to model user behavior when navigating XR scenes with 6 Degrees of Freedom (6DoF) can therefore lead to significant reductions in network and computational resource consumption, while limiting the impact on perceived quality.
However, modeling user behavior in 6DoF presents several challenges, as current solutions do not easily adapt to immersive scenarios in which several factors contribute to the content that is ultimately displayed to the user. For example, two users looking at the same content from different distances will see different levels of detail, which can significantly change their subsequent behavior.
I am interested in creating models of user behavior in 6DoF that take into account the visual saliency of the immersive content being displayed, as well as the similarities between users' predispositions, to effectively explain how users move and interact with immersive media.
Zhou, X., Viola, I., Alexiou, E., Jansen, J., and Cesar, P., 2023, October. QAVA-DPC: Eye-Tracking Based Quality Assessment and Visual Attention Dataset for Dynamic Point Cloud in 6 DoF. In Proceedings of the IEEE International Symposium on Mixed and Augmented Reality (ISMAR). IEEE.
Rossi, S., Viola, I., Toni, L., and Cesar, P., 2023, June. Extending 3-DoF Metrics to Model User Behaviour Similarity in 6-DoF Immersive Applications. In Proceedings of the 14th ACM Multimedia Systems Conference (MMSys ’23). Association for Computing Machinery, New York, NY, USA.
Rossi, S., Viola, I., Toni, L., and Cesar, P., 2021. A new Challenge: Behavioural analysis of 6-DoF user when consuming immersive media. In 2021 IEEE International Conference on Image Processing (ICIP). IEEE.
Advances in telecommunication systems and innovative network solutions indicate that transmission of immersive media content will become more widespread as bandwidth capacity grows. However, immersive acquisition produces volumes of data several orders of magnitude larger than traditional image and video content, so new algorithmic solutions are needed to reduce and transmit the data efficiently while maintaining the desired perceptual quality.
I am interested in designing and evaluating new solutions for immersive media that are suitable for real-time, peer-to-peer transmission systems while providing the best possible quality of experience for users. In particular, for the use case of real-time communication in XR, core system aspects such as low latency, low complexity at the encoder and decoder sides, and limited power consumption should be combined with faithful self-representation in the spatial and temporal domains, saliency-driven partitioning and compression, and user-adaptive encoding and transmission.
Viola, I., Jansen, J., Subramanyam, S., Reimat, I. and Cesar, P., 2023. VR2Gather: A collaborative social VR system for adaptive multi-party real-time communication. IEEE MultiMedia.
Subramanyam, S., Viola, I., Jansen, J., Alexiou, E., Hanjalic, A. and Cesar, P., 2022, October. Evaluating the Impact of Tiled User-Adaptive Real-Time Point Cloud Streaming on VR Remote Communication. In Proceedings of the 30th ACM International Conference on Multimedia (pp. 3094-3103).
Subramanyam, S., Viola, I., Hanjalic, A. and Cesar, P., 2020. User-centered Adaptive Streaming of Dynamic Point Clouds with Low Complexity Tiling. In Proceedings of the 28th ACM International Conference on Multimedia (MM ’20). Association for Computing Machinery, New York, NY, USA.
Viola, I., Maretic, H.P., Frossard, P. and Ebrahimi, T., 2018, September. A graph learning approach for light field image compression. In Applications of Digital Image Processing XLI (Vol. 10752, p. 107520E). International Society for Optics and Photonics.