Schedule

Detailed information about the activities

Statistics for data science (Dan Nicolae)

  • Foundations of data analysis
  • Statistical inference with resampling methods
  • Probability and simulations

Machine learning (Dan Nicolae)

  • Linear models and inference
  • Model complexity
  • Prediction and classification
  • Neural networks

Large Language Models (LLMs) – reasoning capabilities and model calibration (Cornelia Caragea)

  • Prompting strategies in LLMs – Zero-Shot vs. In-Context Learning
  • LLM reasoning capabilities
  • LLM calibration – do they know what they do not know?

Knowledge graphs (Dumitru Roman and Roberto Avogadro)

  • Introduction to graph data structures
  • Knowledge Graphs
  • Graph data management (graph databases with Neo4j, graph data model, graph construction and querying)

LLMs and Agentic AI (Ioan Toma)

  • Introduction to Agentic AI
  • Agent Frameworks

Conversational AI (Ioan Toma)

  • Conversational AI setup and designing a chatbot interface
  • Semantic Knowledge Graphs and their role in Conversational AI
  • Building a chatbot using the Onlim Conversational AI framework

Time series analysis and forecasting (Jože Rožanec)

  • Introduction to time series
  • Analysis tools and real-world examples
  • Time series forecasting

Time series: Forecasting, XAI, and databases (Jože Rožanec)

  • Using network models to represent and forecast time series
  • Introduction to explainability methods
  • Introduction to time series databases

High performance data processing (Radu Prodan)

  • Parallel computing architectures
  • Multiprocessing
  • Parallel algorithms
  • Parallel computing for AI and data science

Data/AI pipelines (Nikolay Nikolov)

  • Introduction to data/AI pipelines
  • Data/AI pipelines using containers

Operationalizing data and AI pipelines (Wiktor Sowinski-Mydlarz)

  • Contemporary data processing
  • GATE Institute Data Platform
  • Alternatives and decisions
  • Pipeline lifecycle

Management of data and AI pipelines (Wiktor Sowinski-Mydlarz)

  • Deployment of data and ML pipelines
  • Orchestration of data and ML pipelines
  • Monitoring of data and ML pipelines

Findable, Accessible, Interoperable, Reusable (FAIR) data (Anna Fensel)

  • Introduction to FAIR data, with examples from the agri-food and health domains
  • How to make data FAIR? Open data, closed data and everything in between
  • Research data infrastructures

Best practices in data sharing (Anna Fensel)

  • Legal compliance (GDPR, AI Act, Data Act)
  • Consent, contracts and licenses, empowered with knowledge graphs
  • Incentivising data sharing

Software (preliminary): Software tools/services to be used during the sessions include: