Dosežki študijskega programa Informacijske in komunikacijske tehnologije

Achievements of the Information and communication technologies study program

Kemal Alič

Priložnostno omrežno kodiranje v brezžičnih zankastih omrežjih

Opportunistic network coding in wireless mesh networks

Mentor in somentor / Supervisor and Co-Supervisor: Aleš Švigelj

Priča smo skokovitemu porastu brezžičnega internetnega prometa. Fizične zakonitosti brezžičnih komunikacij nam narekujejo racionalno uporabo spektra in nenehno iskanje izboljšav tehnologij. Med slednjimi uporaba omrežnega kodiranja ponuja vrsto rešitev, ki lahko prispevajo k povečanju prenosnih kapacitet. V tej disertaciji priložnostno omrežno kodiranje predlagamo kot neodvisno komunikacijsko plast v komunikacijskem sistemu, ki avtonomno sprejema odločitve o kodiranju paketov. Dober kodirni postopek potrebuje omejitev kodirnih možnosti, saj je v praksi kodirnih možnosti preveč, da bi jih lahko kvalitetno ovrednotili. Predlagana nova komunikacijska plast je zadolžena za priložnostno omrežno kodiranje in je popolnoma neodvisna od višje in nižje ležečih komunikacijskih plasti. Postopek je učinkovit in vnaša malo komunikacijske režije v komunikacijo. V fazi razvoja postopka smo iskali ravnotežje med številom kodiranih paketov in uspešnostjo dekodiranja le-teh na sprejemnih vozliščih.

Wireless Internet traffic is growing at a staggering rate. Together with the physical constraints of wireless communication, this drives an ongoing pursuit of using radio resources as rationally as possible. Network coding provides a strong means of utilising the full network capacity. In this dissertation, we treat opportunistic network coding as an independent communication layer in the communications stack that autonomously decides which packets to code. We argue that, in practice, there are too many coding possibilities to estimate reliably which of them are good coding matches. Hence, the set of possibilities should be restricted so that the decision process remains reliable. We therefore propose a network coding layer that is completely transparent to the layers above and below it, named Bearing Opportunistic Network coding. It is an effective procedure that introduces little overhead to the network. When designing the procedure, we looked for a balance between the number of coded packets and the decoding success rate at the receiving nodes.

Grafična predstavitev splošnega primera kodiranja za ustrezen par paketov.
Graphical representation of the general coding case for a matching packet pair.

Cormac Callanan

Evalvacija varovanja podatkov internetnih uporabnikov in zagotavljanja zasebnosti v mobilnih omrežjih s študijami primerov

Evaluation of internet user data protection and privacy provision in mobile networks with case studies

Mentor in somentor / Supervisor and Co-Supervisor: Borka Jerman Blažič

Doktorsko delo raziskuje in ponuja rešitve za preprečevanje zlorabe zasebnosti uporabnikov telekomunikacij pri komuniciranju in uporabi mobilnega interneta. Študija raziskuje razmerja in povezave med stopnjo razvitosti telekomunikacijskega trga, bogastvom države, usposobljenostjo uporabnikov, cenovno dostopnostjo mobilne tehnologije, stopnjo tolerance uporabnikov do državne cenzure vsebin in podobnimi grožnjami zasebnosti, ki jih sprožajo različni akterji. Rezultati disertacije izhajajo iz analize in evalvacije podatkov, zbranih med prebivalci 23 različnih držav, kjer je spoštovanje človekovih pravic zelo nizko ali pa ga sploh ni. Raziskave disertacije je finančno podprla svetovna organizacija Freedom House, ki skrbi za spoštovanje človekovih pravic. V doktorskem delu so analizirane in obravnavane razlike med stopnjami tolerance različnih uporabniških skupnosti glede na kriterije, ki opredeljujejo stopnjo razvoja informacijske družbe v posamezni državi in ravni digitalne usposobljenosti uporabnikov. Za boljše razumevanje širših vprašanj, ki obravnavajo zlorabo zasebnosti, doktorska naloga nudi poglobljeno analizo tehnične zmogljivosti večine blagovnih znamk pametnih telefonov, ki ob obvladovanju tehnologije omogočajo zaščito zasebnosti uporabnika. Na podlagi opravljenih raziskav je v doktorskem delu obravnavana in predstavljena razpoložljivost najnovejših orodij za izogibanje blokiranju storitev in podatkov ob uporabi mobilnega interneta, ki ga sproži država prek svojih organov. V doktorskem delu je obravnavan problem »internetnih ograjenih in zaprtih vrtov«, ki se je pojavil kasneje v času razdrobljenosti interneta, ki je še danes eden od ključnih problemov dostopnosti do informacij in podatkov. Dva znanstvena članka sta bila objavljena v revijah z IF, eden od njiju na neposredno povabilo Cormacu Callananu, naj pripravi članek o tej temi za ERA Forum Europe.

The thesis explores and offers solutions for protecting telecommunication users from privacy abuse when communicating and using the mobile Internet. The study examines the relationships and associations between the level of telecommunications market development, the wealth of a country, user proficiency, the affordability of mobile technology, the level of user tolerance of state-implemented content censorship, and similar privacy threats initiated by different actors. The results and findings of the study are based on a comprehensive body of data gathered from the populations of 23 countries where respect for human rights is very low or non-existent. The research was financially supported by the international organization Freedom House, which promotes respect for human rights. Differences between the tolerance levels of the various user communities were analyzed and discussed in relation to the criteria that define the level of information society development in a particular country and the users' digital skill levels. For a better understanding of the wider issues surrounding privacy abuse, the thesis provides an in-depth analysis of the technical capacity of most smartphone brands to protect the user's privacy, provided the user masters the technology. It also examines the availability of the most recent tools for circumventing state-initiated blocking of mobile internet services and data. The thesis addresses the problem of “internet walled gardens”, which appeared later, in the time of internet fragmentation, and which is still a key problem for access to information and data today. Two scientific papers were published in journals with an IF, one of them following a direct invitation to Cormac Callanan to prepare a paper on the topic for ERA Forum Europe.

Ocena učinkovitosti blokiranja za vsako vrsto internetne tehnologije v primerjavi z obstoječo implementirano tehnologijo blokiranja.
Assessment of the effectiveness of blocking for each type of internet technology, compared with the blocking technology implemented.

Primož Cigoj

Samodejno odkrivanje ranljivosti za potrebe zagotavljanja varnosti spletnega mesta

An automated vulnerability detection for website security

Mentor in somentor / Supervisor and Co-Supervisor: Borka Jerman Blažič

Rezultat raziskav, predstavljenih v doktorski nalogi, je nova učinkovita metoda in avtomatsko orodje za prepoznavanje ranljivih spletnih mest, ki so lahka tarča kibernetskih napadov. Zmogljivo in hitro delujoče izvirno orodje Vulnet je bilo ovrednoteno glede na njegovo zmogljivost in čas, potreben za pregled vseh spletnih strani interneta z aplikacijo WCMS. Vulnet je pregledal stotine milijonov spletnih mest in v razumnem času identificiral njihovo ranljivost. Orodje in metoda Vulnet omogočata z dinamičnim pregledovanjem spletnih strani, zgrajenih z aplikacijo WCMS skupaj z uporabljenimi vtičniki, izračun stopnje ranljivosti na podlagi novo zasnovanega inovativnega indikatorja za merjenje izpostavljenosti »nevarnosti«, ki jih zaznavamo kot kibernetske napade. Orodje je omogočilo izvedbo študije o splošnem stanju varnosti spletnega prostora v 30 evropskih državah. Študija in izračunani indikatorji ranljivosti so razkrili razmerje med številom nezaščitenih, ranljivih spletnih strani v določeni državi in tem, v kakšni meri prebivalstvo te države obvlada splet in ima ustrezne digitalne veščine. Ugotovljeno je bilo, da je prisotnost večjega števila ranljivih spletnih strani povezana z nižjo stopnjo obvladovanja digitalnih veščin med prebivalci obravnavane države. Ugotovljeno je bilo tudi, da na število ranljivih spletnih mest v državi vpliva še en dejavnik. To je cenovna dostopnost interneta, merjena s stroški/ceno fiksnega dostopa do interneta, normaliziranimi z bruto nacionalnim dohodkom (BND) te države. Rezultati doktorske naloge so bili objavljeni v dveh znanstvenih revijah z visokim IF.

The research presented in the thesis resulted in a new method and an automated tool, Vulnet, for identifying vulnerable websites that are easy targets for cyberattacks. The capable and fast original tool was evaluated for its capacity by inspecting hundreds of millions of websites on the global internet and identifying their vulnerabilities in a reasonable time. Through dynamic scans of websites built with a WCMS application, together with their attached plug-ins, the tool and the method calculate the level of vulnerability using a newly designed, innovative indicator that measures the “insecurity” level. The tool made it possible to carry out a study of the overall state of web space security in 30 European countries. The study and the calculated insecurity indicators revealed a relation between the number of insecure websites in a country and the level of digital skills among its population. A higher number of insecure websites was found to be related to lower levels of digital skills among the country's population. Another factor found to influence the number of insecure websites in a country is the affordability of internet access, measured as the cost of fixed internet access normalized by the country's Gross National Income (GNI). The results of the thesis were published in two scientific journals with a high IF.

Spremljanje rezultatov Vulneta o ranljivosti spletnih mest v realnem času.
Real-time monitoring of the Vulnet results.

Božidara Cvetković

Polnadzorovano strojno učenje z več modeli za prilagajanje uporabniku

Multi-model semi-supervised learning for personalisation

Mentor in somentor / Supervisor and Co-Supervisor: Mitja Luštrek, Matjaž Gams

Zaradi čedalje večje razširjenosti nosljivih senzorskih naprav so metode strojnega učenja za interpretacijo senzorskih podatkov, ki pomagajo razumeti človeško vedenje, vse bolj pomembne. Vendar pa spričo raznolikosti ljudi modeli, ki so bili naučeni na eni skupini, ne delujejo vedno dobro na podatkih novih ljudi. Ta disertacija se zato ukvarja s problemom prilagajanja takšnih modelov uporabniku s pomočjo polnadzorovanega strojnega učenja. Običajno od novega uporabnika ni težko pridobiti velike količine neoznačenih podatkov. Pristop, predlagan v disertaciji, take podatke obogati z majhno količino označenih podatkov (npr. z aktivnostjo uporabnika), nato pa izvirni model kombinira z novimi označenimi in neoznačenimi podatki, kar bistveno izboljša delovanje pri novem uporabniku. Pristop je bil uspešno preizkušen na enem regresijskem in treh klasifikacijskih problemih, ki zadevajo spremljanje človeških aktivnosti.

Due to the proliferation of wearable sensing devices, machine-learning methods that interpret sensor data in order to understand human behaviour are increasingly important. However, due to human diversity, models trained on one group of people may not perform well on data from previously unseen people. This dissertation thus tackles the problem of personalising such models using semi-supervised learning. Large amounts of unlabelled data from a new person are usually readily available. The approach proposed by the dissertation adds to this a small amount of labelled data (e.g., with the person’s activity), and then combines the original model with the newly collected labelled and unlabelled data to significantly improve the performance for the new person. The approach was successfully tested on one regression and three classification problems involving human activity monitoring.

Shema metode polnadzorovanega strojnega učenja Multi-Classifier Adaptive Training (MCAT) za prilagajanje modelov uporabniku.
A schema of the Multi-Classifier Adaptive Training (MCAT) semi-supervised method for personalising machine-learning models.

Tome Eftimov

Statistična analiza podatkov in obdelava naravnega jezika za prehransko znanost

Statistical Data Analysis and Natural Language Processing for Nutrition Science

Mentor in somentor / Supervisor and Co-Supervisor: Barbara Koroušić Seljak

Eftimova doktorska naloga predstavlja nov pristop za raziskovanje domenskega znanja. Statistični del se osredotoča na pridobivanje robustnih rezultatov za objave, medtem ko del, povezan z obdelavo naravnega jezika (NLP), odkriva in normalizira pomembne informacije za odkrivanje znanja. Eftimov je uvedel statistično metodo za primerjavo eksperimentalnih podatkov, odporno na osamelce in majhne razlike, z uporabo razvrstitvene sheme, ki temelji na celotni porazdelitvi podatkov namesto na eni sami statistiki. Za NLP je predlagal metodo prepoznavanja imenovanih entitet, ki temelji na pravilih in ne zahteva anotiranega korpusa. Izboljšal je podobnost besedilnih nizov, specifičnih za domeno, s pomočjo verjetnostnega modeliranja morfoloških informacij. Statistična metoda, testirana na optimizacijskih podatkih, je pokazala bolj robustne rezultate kot običajni pristopi, zlasti v prisotnosti osamelcev. Za podobnost besedilnih nizov v domeni specifičnih segmentov je bila metoda uporabljena na konceptih hrane in dosegla boljše rezultate kot običajne mere podobnosti nizov.

Eftimov’s PhD thesis presents a novel approach for knowledge exploration within a domain. Its statistical part focuses on obtaining robust results for publications, while the natural language processing (NLP) part extracts and normalizes information to track knowledge. Eftimov introduces a novel statistical method for comparing experimental data, robust to outliers and small differences, using a ranking scheme based on the entire data distribution rather than a single statistic. For NLP, he proposes a rule-based named-entity recognition method that does not require an annotated corpus. He improves domain-specific text string similarity through probability modeling of morphological information. The statistical method, tested with optimization data, has shown more robust results than common approaches, particularly with outliers. For text similarity in domain-specific segments, the method applied to food concepts achieves better results than common string similarity measures.

Globoka statistična primerjava za metahevristične stohastične optimizacijske algoritme.
Deep statistical comparison for meta-heuristic stochastic optimization algorithms.

Hristijan Gjoreski

Kontekstno sklepanje v ambientalni inteligenci

Context-based reasoning in ambient intelligence

Mentor in somentor / Supervisor and Co-Supervisor: Matjaž Gams, Mitja Luštrek

Doktorska disertacija Hristijana Gjoreskega z naslovom “Context-based reasoning in ambient intelligence” prinaša nov in izviren pristop k uporabi kontekstualnih informacij za izboljšanje umetne inteligence in strojnega učenja, zlasti na področjih ambientalne inteligence ter vsepovsod prisotnega in prodornega računalništva. Metodologija CoReAmI, ki kombinira več virov informacij vodeno s kontekstom, je bila uspešno preizkušena v treh različnih študijah primera: prepoznavanje človeških aktivnosti, ocenjevanje porabe energije pri ljudeh in zaznavanje padcev. CoReAmI je v vsakem primeru znatno presegel obstoječe pristope. Raziskava je povezana s tremi evropskimi projekti (Confidence, Chiron in Commodity), izdelki, razviti v projektih, pa temeljijo na tehnikah, razvitih med doktorskim študijem. Praktična uporabnost je dodatno potrjena z izdelavo sistema za prepoznavanje aktivnosti in zaznavanje padcev v realnem času, ki je zmagal na mednarodnem tekmovanju EvAAL 2013. Hristijan je do doktorata objavil 5 člankov in 17 referatov ter vložil 2 patentni prijavi. Dobil je nagrado za najboljšega mladega raziskovalca v svoji državi.

Hristijan Gjoreski’s PhD thesis brings a novel and original approach to the use of contextual information to improve artificial intelligence and machine learning, in particular in the fields of ambient intelligence and ubiquitous and pervasive computing. The CoReAmI methodology, which combines multiple sources of information under the guidance of context, has been successfully tested in three different case studies: human activity recognition, estimation of human energy expenditure, and fall detection. In each case, CoReAmI significantly outperformed existing approaches. The research is linked to three European projects (Confidence, Chiron and Commodity), and the products developed in those projects are based on techniques developed during the PhD studies. The practical applicability is further validated by the creation of a real-time activity recognition and fall detection system, which won the EvAAL 2013 international competition. By the time of his PhD, Hristijan had published 5 journal articles and 17 conference papers and filed 2 patent applications. He received the award for the best young researcher in his country.

Martin Gjoreski

Spoj klasičnega in globokega strojnega učenja za mobilno spremljanje zdravja in obnašanja z nosljivimi senzorji

A fusion of classical and deep machine learning for mobile health and behavior monitoring with wearable sensors

Mentor in somentor / Supervisor and Co-Supervisor: Matjaž Gams, Mitja Luštrek

Dr. Martin Gjoreski se osredotoča na področje elektronskega in mobilnega zdravja ter na ugotavljanje obnašanja z nosljivimi senzorji, kot so zapestnice ali telefoni. Psiho-fizično stanje je težko določiti celo za ljudi, saj na primer iz zvoka le v 86 % primerov pravilno uganejo čustva govorca. Dr. Gjoreski je razvil novo računalniško arhitekturo, ki združuje globoke in klasične metode ter preneseno učenje, kar omogoča uporabo znanj, pridobljenih na enem področju, na drugem. Sistem je preizkusil v sedmih različnih domenah: ugotavljanje stresa s kontekstualnimi podatki, ugotavljanje pritiska iz EKG, analiza čustev, ocenjevanje kognitivnega stanja, zaznavanje kroničnega popuščanja srca, zaznavanje distrakcij pri vožnji in prepoznavanje načinov prevoza s podatki iz mobilnega telefona. V šestih od sedmih domen je dosegel pomembna izboljšanja. Njegovo delo je bilo objavljeno v 10 znanstvenih člankih z visokim faktorjem vpliva, večina v najvišji četrtini, en članek pa je prejel faktor 13 in je ocenjen kot “A”, kar ga uvršča med najboljše na svojem področju. Martin je bil nagrajen z Zlatim znakom IJS.

Dr Martin Gjoreski focuses on electronic and mobile health and on behaviour monitoring with wearable sensors such as wristbands or phones. The psycho-physical state is difficult to determine, even for humans: for example, people correctly guess a speaker's emotions from the sound of the voice in only 86% of cases. Dr Gjoreski has developed a new computational architecture that combines deep and classical methods with transfer learning, allowing knowledge gained in one area to be applied in another. He tested the system in seven different domains: stress detection with contextual data, pressure estimation from ECG, emotion analysis, cognitive state assessment, chronic heart failure detection, detection of driving distractions, and recognition of modes of transport from mobile phone data. Significant improvements were achieved in six of the seven domains. His work has been published in 10 scientific papers in journals with a high impact factor, most of them in the top quartile; one paper has an impact factor of 13 and an ‘A’ rating, placing it among the best in its field. Martin was awarded the IJS Gold Medal.

Giovanni Godena

Modeli programske opreme za vodenje šaržnih procesov

Models of batch process control software

Mentor in somentor / Supervisor and Co-Supervisor: Stanislav Strmčnik

Predstavljeni so pristopi, modeli in rešitve na treh področjih programske opreme za vodenje šaržnih procesov. Na področju abstrakcije receptov je predstavljena tabelarna predstavitev receptov, ki je enostavna za razumevanje s strani operaterjev in za implementacijo na platformi industrijskih krmilnikov. Na področju modela obnašanja entitet postopkovnega vodenja je predstavljen nov model obnašanja, ki v primerjavi z aktualnimi modeli omogoča boljše obvladovanje kompleksnosti razvoja sistemov vodenja šaržnih procesov. Na področju objektnega modela opreme in postopkovnega vodenja je predstavljen nov sofisticiran objektni model, ki temelji na konceptu večkratnega dedovanja oziroma na prekrivajočih se razredih opreme in ki omogoča praktično popolno izogibanje podvajanju informacij v receptih ter zagotavlja visoko stopnjo njihove ponovne uporabe. Vsi omenjeni rezultati so bili implementirani v doma razvitem orodju in uspešno uporabljeni v večjem številu zahtevnih industrijskih projektov.

Approaches, models and solutions in three areas of batch process control software are presented. In the area of recipe abstraction, a tabular representation of recipes is presented, which is easy for operators to understand and easy to implement on industrial controller platforms. In the area of the behavior model of procedural control entities, a new behavior model is presented, which, compared to current models, enables better management of the complexity of developing batch process control systems. In the area of the object model of equipment and procedural control, a new sophisticated object model is presented, which is based on the concept of multiple inheritance, i.e. on overlapping classes of equipment, and which makes it possible to avoid duplication of information in recipes almost completely and ensures a high degree of recipe reuse. All the mentioned results were implemented in a tool developed in-house and successfully used in several demanding industrial projects.

Nov model opreme in postopkovnega vodenja šaržnih procesov.
The new model of equipment and procedural control of batch processes.

Miha Grčar

Rudarjenje podatkov v tekstovno obogatenih heterogenih informacijskih omrežjih

Mining text-enriched heterogeneous information networks

Mentor in somentor / Supervisor and Co-Supervisor: Nada Lavrač

Disertacija raziskuje uporabo strukturnih/relacijskih podatkov za izboljšanje nalog rudarjenja besedila, kot sta klasifikacija in razvrščanje. Glavni prispevek disertacije je metodologija TEHmINe za rudarjenje heterogenih informacijskih omrežij, obogatenih z besedilom. TEHmINe projicira besedila in strukture v skupni vektorski prostor za odkrivanje znanja, ki se uporablja za različne probleme podatkovnega rudarjenja. Primeri uporabe v realnih domenah potrjujejo, da TEHmINe znatno prekaša standardne pristope rudarjenja besedil pri kategorizaciji video predavanj in nalogah ontološkega poizvedovanja. Metodologija OntoBridge, ki predstavlja prilagoditev TEHmINe, izboljša sisteme razvrščanja z integracijo tekstovnih podatkov z relacijsko strukturo.

The thesis explores leveraging structural/relational data to enhance text mining tasks like classification and ranking. The main contribution is the developed TEHmINe methodology for mining text-enriched heterogeneous information networks. TEHmINe projects texts and structures into a common vector space for knowledge discovery, applicable to various data mining problems. Demonstrated in real-world use cases, TEHmINe significantly outperforms standard text mining approaches in video lecture categorization and ontology querying tasks. The OntoBridge methodology, an adaptation of TEHmINe, improves ranking systems by integrating textual data with relational structure.

Pregled metodologije TEHmINe v obliki delotoka.
A workflow-based overview of the TEHmINe methodology.

Gordana Ispirova

Izkoriščanje domenskega znanja pri napovednem učenju iz podatkov o živilih in prehrani

Exploiting domain knowledge in predictive learning from food and nutrition data

Mentor in somentor / Supervisor and Co-Supervisor: Barbara Koroušić Seljak, Tome Eftimov

Disertacija sodi na interdisciplinarno področje strojnega učenja (SU) in prehrane. Osredotoča se na izkoriščanje domenskega znanja, zbranega v virih kot sta FoodEx2 Evropske agencije za varnost hrane in FAO Traffic Light. Novi cevovod SU povezuje znanje pridobljeno iz podatkov z domenskim znanjem, pri čemer se uporablja predstavitveno učenje, nenadzorovano in nadzorovano SU. Cevovod je bil ovrednoten na napovedovanju hranilnih vrednosti iz besedilnih podatkov o receptih. Uvedene domensko specifične vdelave so se izkazale za visoko zmogljive, zato je kandidatka pripravila dva korpusa vdelav za sestavine in recepte. Normalizacijo podatkov je izvedla z uporabo slovarja ter pravil prepoznavanja imenskih entitet in preslikave podatkov v bazo podatkov o sestavi živil iz šestih heterogenih večjezičnih naborov podatkov. Uvedla je tudi pomemben indeks posplošljivosti za oceno stopnje zaupanja v napovedni model naučen na izbranem nizu podatkov za uporabo na drugem naboru podatkov.

Gordana Ispirova’s PhD thesis explores the intersection of machine learning (ML) and nutrition science, focusing on leveraging domain knowledge from the EFSA FoodEx2 classification system and the FAO Traffic Light system. The thesis develops a novel ML pipeline that combines representation learning, unsupervised ML, and supervised ML to predict nutrient values from recipe texts. High-performance domain-specific embeddings were created through data normalization, named-entity recognition, and mapping to a food composition database from six heterogeneous multilingual recipe datasets. Two corpora of ingredient and recipe embeddings were produced for research and application purposes. Additionally, a generalizability index was introduced to measure the transferability of predictive models between datasets, highlighting the significant impact of data on model performance.

Grafična predstavitev vektorskih reprezentacij, ki upoštevajo domensko specifično hevristiko za vsako skupino receptov z istim imenom.
Visual representation of the vector representations that take into account the domain-specific heuristic for each group of recipes with the same name.

Vito Janko

Prilagajanje senzorskih nastavitev za energijsko učinkovito prepoznavanje konteksta

Adapting sensor settings for energy-efficient context recognition

Mentor in somentor / Supervisor and Co-Supervisor: Mitja Luštrek

Dostopnost prenosnih senzorskih naprav omogoča spremljanje konteksta ljudi (npr. aktivnosti ali vrste lokacije), kadar je to zaželeno. Vendar so te naprave običajno baterijske, kar je omejujoč dejavnik za številne aplikacije prepoznavanja konteksta. Disertacija se zato ukvarja s problemom optimizacije senzorske konfiguracije, ki išče ravnotežje med porabo energije in natančnostjo prepoznavanja. Upošteva, katere senzorje uporabiti, s kakšno frekvenco in kakšnimi delovnimi cikli za vsak kontekst, ki ga je treba prepoznati. Nato uporabi matematični model za napovedovanje porabe energije vsake konfiguracije. Na koncu z optimizacijskim algoritmom poišče najboljše senzorske konfiguracije, izmed katerih lahko načrtovalec sistema za prepoznavanje konteksta izbere tisto z najboljšim kompromisom med porabo energije in natančnostjo. Pristop je bil uspešno preizkušen na štirih podatkovnih zbirkah, na voljo pa je tudi odprtokodna izvedba.

The accessibility of portable sensing devices makes it possible to monitor the context of people (e.g., their activity or type of location) when desired. However, these devices are generally battery-powered, which is a limiting factor for many context-recognition applications. The dissertation thus tackles the problem of optimising the sensing configuration to balance the energy consumption and the accuracy of the recognition. It considers which sensors to use, with what frequency and what duty cycles for each context to be recognised. It then uses a mathematical model to predict the energy consumption of each configuration. Finally, it applies an optimisation algorithm to find the best sensing configurations, from which the designer of a context-recognition system can select the one with the best trade-off between energy consumption and accuracy. The approach was successfully tested on four datasets, and an open-source implementation is available.

Razmerja med uporabljenimi senzorskimi podatki in s tem povezano porabo energije ter klasifikacijsko napako pri uporabi teh podatkov za različne pristope k optimizaciji.
Trade-offs between sensor data used and the associated energy consumption on one hand, and classification error when using those sensor data on the other hand, for different optimisation approaches.

Arsim Kelmendi

Prostorsko raznoliko modeliranje slabljenja zaradi dežja v satelitskih komunikacijskih sistemih

Site diversity modelling of rain attenuation for satellite communication systems

Mentor in somentor / Supervisor and Co-Supervisor: Gorazd Kandus

V disertaciji so predstavljeni novi modeli za napovedovanje statistike slabljenja signala zaradi dežja v zemeljsko-satelitskih krajevno raznolikih komunikacijskih sistemih, ki se uporabljajo za blaženje močnih slabljenj signala zaradi dežja in s tem posledično povečujejo razpoložljivost sistema. To je izredno pomembno, saj se povpraševanje po širokopasovnih storitvah z visokimi hitrostmi prenosa podatkov povečuje tudi na geografskih območjih, kot so podeželje, otoki, morje ali gore. Na teh območjih so satelitske povezave lahko najbolj praktična, ekonomsko učinkovita in včasih celo edina izvedljiva rešitev. Glavni cilj predstavljene disertacije je bil poiskati napovedni model z dobro učinkovitostjo glede napak, brez omejitev glede števila zemeljskih postaj, z enostavno računsko implementacijo in z možnostjo razširitve vhodnih parametrov, kot je geometrija. Za dokazovanje hipoteze je bila uporabljena teorija Gaussovih in Arhimedovih funkcij kopula.

The doctoral dissertation presents novel models for predicting rain attenuation statistics in Earth-satellite multiple-site diversity systems. Site diversity systems are used to mitigate severe rain attenuation and thereby increase system availability. This is of paramount importance, as the demand for broadband services with high data rates is also increasing in geographical areas such as rural, island, maritime or mountainous regions, where satellite links may be the most practical, economically efficient and sometimes even the only viable solution. The main goal of the dissertation was to find a prediction model with good error performance, no limitation on the number of ground stations, a simple computational implementation, and the possibility of extending the input parameters, for example with the geometry. The theory of Gaussian and Archimedean copula functions was used to prove the hypothesis.

Krajevno raznolik sistem s tremi postajami.
Three-site diversity system.

Teodora Kocevska

Identifikacija lastnosti notranjega radijskega okolja na podlagi informacij o stanju kanala z uporabo pristopov strojnega učenja

Identification of indoor radio environment properties based on channel state information using machine learning approaches

Mentor in somentor / Supervisor and Co-Supervisor: Andrej Hrovat, Aleksandra Rashkovska Koceva

Teodora Kocevska je v doktorski nalogi obravnavala problem opisovanja notranjega okolja z uporabo najsodobnejših brezžičnih tehnologij in pristopov strojnega učenja (ML). Predlagala, formalizirala in ovrednotila je novo, inteligentno metodologijo za karakterizacijo okolja, ki ni odvisna od specializirane opreme in temelji na informacijah o stanju kanala z uporabo pristopov strojnega učenja. Predlagana metodologija temelji na dveh predpostavkah: sprejeti signal prenaša podpis radijskega okolja (RE) in podpis RE je mogoče oceniti z analizo brezžične povezave. Ocenila je predlagano metodologijo za prepoznavanje lastnosti materialov na podlagi impulznega odziva kanala (CIR) in dokazala, da je metodologijo mogoče uporabiti za prepoznavanje materialov v preprostih notranjih okoljih. Metodologija zagotavlja podlago, potrebno za razvoj najsodobnejših metod za okoljsko ozaveščene brezžične komunikacije v zaprtih prostorih, ki bodo pomembne v komunikacijah naslednje generacije.

In her thesis, Teodora Kocevska addressed the problem of indoor environment characterization using state-of-the-art wireless technologies and machine learning (ML) approaches. She proposed, formalized, and evaluated a novel, intelligent environment characterization methodology that does not rely on specialized equipment and is based on channel state information processed with machine learning approaches. The proposed methodology rests on two assumptions: the received signal conveys a radio environment (RE) signature, and the RE signature can be estimated by analyzing the wireless link. She evaluated the proposed methodology for identifying surface materials from the channel impulse response (CIR) and showed that it can be applied to identify the material of surfaces in simple indoor environments. The methodology provides the foundation needed to develop state-of-the-art methods for environment-aware indoor wireless communications, which will be important in the era of next-generation communications.

Karakterizacija notranjega radijskega okolja (RE) z uporabo podpisa RE in strojnega učenja (ML) za okoljsko ozaveščene brezžične komunikacije.
Indoor radio environment (RE) characterization using RE signature and machine learning (ML) for environment-aware wireless communications.

Tine Kolenik

Inteligentni kognitivni sistem za računsko psihoterapijo s pogovornim agentom za spreminjanje odnosa in vedênja pri stresu, anksioznosti in depresiji

An intelligent cognitive system for computational psychotherapy with a conversational agent for attitude and behavior change in stress, anxiety and depression

Mentor in somentor / Supervisor and Co-Supervisor: Matjaž Gams, Günter Schiepek

Tine Kolenik se v svoji doktorski disertaciji ukvarja z razvojem inteligentnega kognitivnega sistema za računsko psihoterapijo s pomočjo pogovornega agenta, namenjenega spreminjanju odnosa in vedenja pri stresu, anksioznosti in depresiji. Sistem združuje napredne metode umetne inteligence, vključno s teorijo uma, za boljše razumevanje in odzivanje na čustvene in kognitivne potrebe uporabnikov. Sistem je bil preizkušen v različnih kontekstih in je pokazal sposobnost, da simulira človeško razmišljanje in odzivanje, kar je ključnega pomena za učinkovito psihoterapijo. Raziskava interdisciplinarno povezuje umetno inteligenco, psihologijo in tehnologijo spremembe vedenja, s čimer zapolnjuje raziskovalne vrzeli na teh področjih. Kolenikovo delo, ki je bilo že delno objavljeno in priznano v akademski skupnosti, predstavlja pomemben prispevek k razvoju digitalnega duševnega zdravja. Do doktorata je objavil 6 člankov in dve poglavji v knjigi, skupno 21 objav.

Tine Kolenik’s PhD thesis focuses on the development of an intelligent cognitive system for computational psychotherapy that uses a conversational agent to change attitudes and behaviours in stress, anxiety and depression. The system combines advanced artificial intelligence methods, including a theory of mind, to better understand and respond to the emotional and cognitive needs of users. It has been tested in a variety of contexts and has demonstrated the ability to simulate human thinking and responding, which is crucial for effective psychotherapy. The research is interdisciplinary, integrating artificial intelligence, psychology and behaviour change technology, and fills research gaps in these fields. Kolenik’s work, which has already been partially published and recognised in the academic community, represents an important contribution to the development of digital mental health. By the time of his PhD, he had published 6 articles and two book chapters, with 21 publications in total.

Tadej Krivec

Simulacija aproksimiranih avtoregresijskih modelov na podlagi gaussovskih procesov

Simulation of approximated Gaussian process autoregressive models

Mentor in somentor / Supervisor and Co-Supervisor: Juš Kocijan

Doktorska disertacija Tadeja Krivca je pomemben prispevek k teoriji in praksi modeliranja nelinearnih dinamičnih sistemov, ključnih v tehniki in naravoslovju. Delo obravnava simulacijo podatkovno modeliranih dinamičnih procesov, osredotoča se na paradigmo modeliranja z gaussovskimi procesi (GP) ter uporablja pristop nelinearnega avtoregresijskega modela z zunanjimi vhodi (NARX). Rezultati dela se osredotočajo na informacijske in komunikacijske tehnologije, rešujejo izziv natančne in učinkovite simulacije kompleksnih verjetnostnih modelov dinamičnih sistemov, ki so doslej omejevali njeno uporabo. Glavni izziv pri teh modelih je širjenje negotovosti, ki je rešen z metodo za učinkovito simulacijo aproksimiranih GP-NARX-modelov. Ta metoda omogoča modeliranje iz velepodatkov in je bila uspešno preizkušena na kompleksnih dinamičnih sistemih. Citirane objave v visoko cenjenih revijah in praktična uporabnost pri reševanju tehničnih in naravoslovnih problemov potrjujeta vrhunskost raziskave.

The doctoral dissertation by Tadej Krivec represents a significant contribution to the theory and practice of modelling nonlinear dynamic systems, which are crucial in engineering and the sciences. The work addresses the simulation of data-driven models of dynamic processes, focusing on the paradigm of modelling with Gaussian processes (GP) and employing nonlinear autoregressive models with external inputs (NARX). The results are relevant to information and communication technologies and tackle the challenge of precise and efficient simulation of complex probabilistic models of dynamic systems, which has thus far limited their application. The main challenge is the propagation of uncertainty through the models, which is resolved by a method for efficient simulation of approximated GP-NARX models. This method enables modelling from big data and has been successfully tested on complex dynamic systems. Cited publications in esteemed journals and the practical utility in addressing technical and scientific problems confirm the excellence of the research.

Simulirani odziv in ustrezni interval 95 % odstopanja za izbrane vremenske spremenljivke v bližini jedrske elektrarne Krško. Rezultati so prikazani med prvim in sedmim marcem 2017.
Simulated response and the corresponding 95% uncertainty interval for the selected weather variables near the Krško nuclear power plant. The results are shown for the period from 1 to 7 March 2017.

Urban Kuhar

Three-phase state estimation in power distribution systems

Mentor in somentor / Supervisor and Co-Supervisor: Aleš Švigelj, Gregor Kosec

Povečevanje deleža obnovljivih virov energije v zadnjih letih povečuje nepredvidljivost in variabilnost elektroenergetskega omrežja, saj se tovrstni viri praviloma priklapljajo v distribucijska električna omrežja, kar pa prinaša številne tehnične izzive na področju zagotavljanja zanesljivosti obratovanja omrežja. Zato je natančno poznavanje stanja distribucijskega omrežja prvi predpogoj za kakovostno upravljanje le-tega. V disertaciji smo se osredinili na razvoj ocenjevalnika stanja distribucijskega omrežja, pri čemer smo posebno skrb namenili robustnosti in neobčutljivosti na prisotnost slabih meritev. Razvili smo poenoten trifazni model veje električnega omrežja, s katerim je možno enotno predstaviti vse za naš problem relevantne gradnike omrežja. Razvili smo tudi model občutljivosti ocenjevalnikov stanja na majhne napake parametrov modela ter majhne napake meritev. Uporabnost dosežka se kaže pri njegovi testni implementaciji na delu realnega delujočega omrežja Elektra Primorske.

The recent increase in the share of renewable energy sources has heightened the unpredictability and variability of the power grid, as these sources are usually connected to distribution networks. This poses numerous technical challenges in ensuring grid reliability. Consequently, accurate knowledge of the distribution network’s state is essential for effective management. We therefore focused on developing a distribution network state estimator, paying special attention to robustness and insensitivity to the presence of bad measurements. We developed a unified three-phase branch model that allows all network elements relevant to the problem to be represented in the same way. In addition, we developed a model of the sensitivity of state estimators to small model parameter errors and small measurement errors. The usefulness of the achievement was demonstrated by its test implementation on part of the real operating network of Elektro Primorska.

Napredni distribucijski energetski sistemi.
Advanced power-distribution networks.

Nejc Likar

Vodenje sodelujočih robotov z uporabo navideznih mehanizmov

Control of cooperating robots by using virtual mechanisms

Mentor in somentor / Supervisor and Co-Supervisor: Leon Žlajpah

Disertacija obravnava problematiko medsebojne koordinacije več robotov, predvsem dvoročnih robotov. V delu je predlagana nova metoda združevanja serijskih robotskih mehanizmov v enotno kinematično verigo. Objekt oziroma naloga, ki jo opravljata sodelujoča robota, je predstavljena kot navidezni mehanizem, ki povezuje obe robotski roki. Posebnost predlagane sheme vodenja je, da je naloga izvedena z vodenjem navideznega mehanizma, ki ima takšno strukturo, s katero najenostavneje opišemo nalogo. Predlagana shema vodenja omogoča enostaven opis naloge z dvoročnim robotom tudi v primeru, ko se katera od baz robotov premika. Poleg tega je v disertaciji obravnavano tudi, kako izkoristiti redundantnost sestavljenega sistema. Podrobneje je obravnavano, kako se uporabi redundantnost pri izvajanju naloge in sočasnem izogibanju oviram v delovnem prostoru. Za določitev kontakta in kontaktne sile je predlagan kombinirani optimizacijski postopek, ki omogoča detekcijo enega kontakta v realnem času.

The dissertation deals with the coordinated motion of multiple robots, especially dual-arm robots. The work proposes a new method of combining serial robot mechanisms into a single kinematic chain. The object or the task performed by the cooperating robots is represented as a virtual mechanism that connects the two robot arms. The special feature of the proposed control scheme is that the task is performed by controlling a virtual mechanism whose structure describes the task in the simplest way. The proposed control scheme enables a simple description of the task with a dual-arm robot even when one of the robot bases is moving. In addition, the dissertation discusses how to take advantage of the redundancy of the combined system. In particular, it examines how redundancy is used when performing a task while simultaneously avoiding obstacles in the workspace. To determine the contact and the contact force, a combined optimization procedure is proposed, which enables the detection of a single contact in real time.

Sekvenčne slike prikazujejo gibanje dvoročnega robota med nalogo balansiranja pladnja ter istočasnim izogibanjem človeku oziroma človeški roki v delovnem prostoru robota. Detekcija in sledenje človeka potekata v realnem času s pomočjo kamere Microsoft Kinect.
Sequential images show the movement of a dual-arm robot during the task of balancing a tray while simultaneously avoiding a human or a human hand in the robot’s workspace. The detection and tracking of the human occur in real-time using a Microsoft Kinect camera.

Matej Martinc

Kombiniranje nevronskih in simbolnih reprezentacij za procesiranje naravnega jezika

Combining neural and symbolic representations in natural language processing

Mentor in somentor / Supervisor and Co-Supervisor: Senja Pollak

Disertacija predstavi novo strategijo kombiniranja nevronskih in simbolnih reprezentacij, s katero želimo preseči omejitve pristopov, ki temeljijo le na eni vrsti reprezentacij. S pomočjo predlaganega pristopa nam uspe razviti množico novih metod in tekstovnih reprezentacij za reševanje nalog s področja procesiranja naravnega jezika. Uporabnost strategije je prikazana na treh primerih: 1.) Profiliranju avtorjev, kjer predlagana metoda temelji na kombiniranju značilk na podlagi vreče n-gramov in nevronskih značilk, zgeneriranih s pomočjo konvolucijske nevronske mreže; 2.) Detekciji berljivosti teksta, kjer predlagamo novo mero z imenom Ranked Sentence Readability Score, v kateri so statistične značilke, pridobljene s pomočjo nevronskega jezikovnega modela, združene s plitkimi simbolnimi kazalniki berljivosti; 3.) Luščenju ključnih besed, kjer nevronski model združimo s simbolnim modelom, ki ključne besede išče s pomočjo statistike TF-IDF.

The thesis addresses a novel representation learning framework, combining neural and symbolic text representations, and demonstrates its utility for tackling diverse natural language processing problems. The proposed approach, avoiding the deficiencies of purely symbolic and purely neural methods, can be applied for the generation of efficient text representations. Its usefulness is demonstrated on three use cases: 1.) Author profiling, where we combine the bag-of-n-grams features with neural features derived from the convolutional neural network; 2.) Readability prediction, where we propose a novel Ranked Sentence Readability Score, in which statistics derived from the neural language model are combined with shallow symbolic readability indicators that consider simple text statistics; 3.) Keyword extraction, where we combine the neural model with a symbolic unsupervised TF-IDF-based keyword detector in order to improve the recall of the keyword extraction system.

Novo razviti mehanizem pozornosti, ki upošteva tudi informacije o položaju besed in je posebej prilagojen za nalogo luščenja ključnih besed.
Custom attention mechanism developed for keyword extraction that also considers positional information.

Simon Mezgec

Zaznavanje in razpoznavanje slik hrane in pijače z uporabo globokih konvolucijskih nevronskih mrež

Food and drink image detection and recognition using deep convolutional neural networks

Mentor in somentor / Supervisor and Co-Supervisor: Barbara Koroušić Seljak

In his doctoral dissertation, Mezgec developed a novel deep learning architecture for food image recognition, called NutriNet. It modifies the well-known AlexNet architecture by increasing image size and adding an extra convolutional layer. NutriNet was trained on 225,953 images from the Internet, organized into 520 food classes. It outperformed AlexNet and GoogLeNet and was faster to train than AlexNet, GoogLeNet, and ResNet. NutriNet was used to recognize multiple items in a single food image using a training set from the ‘fake food buffet,’ resembling real food. Fully convolutional networks (FCNs), introduced by Long et al., were used for semantic segmentation, classifying each part of the image at the pixel level. The FCN-8s variant, which segments images at the finest grain, was trained on the fake food buffet dataset. The model’s predictions were compared with ground-truth labels using pixel accuracy, achieving a final accuracy of 92.18%.

Primer slike, ki vsebuje več živil. Levo: originalna fotografija. Desno: rezultat prepoznavanja na ravni pikslov (prepoznavanje vsakega živila posebej).
An example of an image containing multiple food items. Left: the original photograph. Right: the result of recognition on the pixel level (the recognition of each food item in the photograph).
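
The pixel accuracy measure mentioned in the abstract can be illustrated with a minimal sketch. It assumes that the prediction and the ground truth are given as equally sized grids of integer class labels; the function name `pixel_accuracy` and the tiny label maps are hypothetical and are not taken from the dissertation.

```python
def pixel_accuracy(predicted, ground_truth):
    """Return the fraction of pixels whose predicted class matches the ground truth."""
    if len(predicted) != len(ground_truth):
        raise ValueError("label maps must have the same number of rows")
    correct = 0
    total = 0
    for pred_row, true_row in zip(predicted, ground_truth):
        if len(pred_row) != len(true_row):
            raise ValueError("label maps must have the same number of columns")
        for pred_label, true_label in zip(pred_row, true_row):
            correct += int(pred_label == true_label)
            total += 1
    if total == 0:
        raise ValueError("label maps must not be empty")
    return correct / total


# Hypothetical 3x4 label maps: 0 = background, 1 and 2 = two food classes.
truth = [
    [0, 0, 1, 1],
    [0, 2, 2, 1],
    [0, 2, 2, 0],
]
prediction = [
    [0, 0, 1, 1],
    [0, 2, 1, 1],  # one pixel of class 2 mistaken for class 1
    [0, 2, 2, 2],  # one background pixel mistaken for class 2
]
accuracy = pixel_accuracy(prediction, truth)
print(f"pixel accuracy: {accuracy:.4f}")
```

Pixel accuracy weights every pixel equally, so large regions such as the background dominate the score; this is one reason it is often reported alongside per-class measures.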

Rok Pahič

Učenje dinamičnih generatorjev gibov z globokimi nevronskimi mrežami

Learning of dynamic movement primitives with deep neural networks

Mentor in somentor / Supervisor and Co-Supervisor: Aleš Ude

Učinkovito robotsko učenje je predpogoj za razvoj avtonomnih robotskih sistemov, ki znajo izvajati različne naloge v hitro spreminjajočih se okoljih. Tehnologije globokega učenja so prispevale k pomembnim prebojem na področju računalniškega vida in obdelave naravnih jezikov, v robotiki pa so bile tovrstne metode zaenkrat manj uspešne. Hipoteza doktorskega dela dr. Pahiča je bila, da je glavni problem pri uporabi globokih nevronskih mrež v robotiki pomanjkanje podatkov in natančnost, ki je potrebna za uspešno delovanje robotov v fizičnih okoljih. Dr. Pahič je razvil nove metode za učenje globokih nevronskih mrež, s katerimi lahko ustrezno obravnavamo razlike med robotskimi gibi. S tem je omogočil natančno računanje robotskih operacij, ki so primerne za izvedbo robotskih nalog v stiku z okoljem. Poleg tega je v svojih raziskavah razvil nove matematične predstavitve robotskih operacij, ki omogočajo uporabo klasičnih metod robotskega učenja tudi ob manjši količini razpoložljivih podatkov.

The ability to learn efficiently is essential for the development of autonomous robotic systems that can perform various tasks in rapidly changing environments. Deep learning technologies have contributed to significant breakthroughs in the fields of computer vision and natural language processing. However, these methods have so far been less successful in robotics. The hypothesis of Dr. Pahič's thesis was that the main obstacles to using deep neural networks in robotics are the lack of data and the precision required for successful operation of robots in physical environments. Dr. Pahič developed new methods for training deep neural networks that appropriately address differences between robotic movements. This has enabled precise computation of robotic operations suitable for performing tasks in contact with the environment. Additionally, his research introduced new mathematical representations of robotic operations that allow the use of classical robot learning methods even with a smaller amount of available data.

Pisanje številk s humanoidnim robotom TALOS. Številke so zaznane in preslikane v ustrezno robotsko gibanje z uporabo konvolucijskih mrež za kodiranje in dekodiranje slik v trajektorije. Za predstavitev robotskih trajektorij so bili uporabljeni dinamični generatorji giba po naravnem parametru.
Writing digits with the humanoid robot TALOS. The digits are detected and mapped to a suitable robotic handwriting motion using a convolutional image-to-motion encoder-decoder network. Arc-length dynamic movement primitives were used to represent the robot's writing trajectories.

Tanja Pavleska

Zmanjševanje vpliva pristranosti uporabnikov v spletnih sistemih zaupanja

Alleviating user bias in online trust systems

Mentor in somentor / Supervisor and Co-Supervisor: Borka Jerman Blažič

Zaupanje v spletno okolje je stalni izziv, še posebej pomemben v luči napredka generativne umetne inteligence. Pristranskosti v oblikovanju sistemov in algoritmov ter pristranskosti v percepcijah in dejanjih uporabnikov znotraj teh sistemov lahko pomembno vplivajo na delovanje interneta kot celote. V svoji doktorski disertaciji z naslovom „Zmanjševanje vpliva pristranosti uporabnikov v spletnih sistemih zaupanja“ je dr. Tanja Pavleska raziskala razvijajočo se pokrajino spletnih sistemov zaupanja in njihovo skladnost z uporabniškimi percepcijami in vedenji. Z analizo sistemskih značilnosti zaupanja in uvajanjem novega okvira, ki temelji na načelih sistemskega oblikovanja, je disertacija obravnavala pomembne vrzeli v oblikovanju in izvajanju trenutnih modelov zaupanja. S študijami primerov in empirično analizo je disertacija identificirala konkretne kognitivne uporabniške pristranskosti, ki vplivajo na delovanje spletnih modelov zaupanja in uvajajo neskladja med izvedbo sistema in percepcijami uporabnikov. Na podlagi ugotovitev je disertacija predlagala metodologijo za identifikacijo in zmanjšanje uporabniških pristranskosti ter pokazala izboljšave v delovanju sistema in uporabniški izkušnji. Ugotovitve poudarjajo pomen integracije zavedanja o zaupanju v spletna omrežja in izpostavljajo pomen multidisciplinarnih pristopov pri oblikovanju digitalnega ekvivalenta družbenega zaupanja. Poleg tega ponujajo dragocene vpoglede za vzpostavljanje medosebnega in medinstitucionalnega zaupanja v vse bolj digitaliziranem svetu.

Trust in the online environment is an ongoing challenge, particularly significant in light of the advances in generative artificial intelligence. Biases in system and algorithm design, along with biases in users’ perceptions and actions within these systems, can significantly impact the functioning of the Internet as a whole. In her PhD thesis “Alleviating user bias in online trust systems”, Dr. Tanja Pavleska investigated the evolving landscape of online trust systems and their alignment with user perceptions and behaviors. By analyzing the systemic features of trust and introducing a novel framework based on systems design principles, the thesis addressed important gaps in the design and implementation of current trust models. Through case studies and empirical analysis, the thesis identified concrete cognitive user biases that affect the performance of online trust models and introduce discrepancies between the system implementation and user perceptions. Based on the findings, the thesis proposed a methodology for identifying and mitigating user bias and demonstrated improvements in both system performance and user experience. The findings underscore the importance of integrating trust awareness into online networks and highlight the relevance of multidisciplinary approaches in shaping the digital analogue of social trust. Moreover, they offer valuable insights for establishing interpersonal and inter-institutional trust in an increasingly digitalized world.

Analiza različnih pogojev za pristranskost uporabnikov na spletnih platformah.
Analysis of different conditions for user bias across online platforms.

Matic Perovšek

Napredni delotoki za procesiranje tekstov v spletni platformi za tekstovno rudarjenje

Advanced text processing workflows in a web-based text mining platform

Mentor in somentor / Supervisor and Co-Supervisor: Bojan Cestnik, Nada Lavrač

Kljub razpoložljivim orodjem, kot sta NLTK in scikit-learn, sta ponovljivost in sistematična primerjava algoritmov še vedno izziv. Ta doktorska disertacija obravnava vrzeli na področjih tekstovnega rudarjenja (TM) in procesiranja naravnega jezika (NLP) prek treh raziskovalnih smeri: izboljšanje infrastruktur, oblikovanje naprednih delovnih tokov za NLP in bisociativno odkrivanje znanja ter pretvorba relacijskega podatkovnega rudarjenja v naloge tekstovnega rudarjenja. Predstavlja TextFlows, spletno platformo za konstruiranje in izvajanje delovnih tokov TM in NLP. Scenariji odkrivanja novega znanja so obravnavani s predlaganimi metodologijami in ovrednoteni na nalogah iz resničnega sveta. Področja, na katera se osredotočamo, vključujejo napredne delovne tokove NLP, bisociativno odkrivanje znanja in transformacijo relacijskega podatkovnega rudarjenja v problem tekstovnega rudarjenja, kjer se je predlagani pristop wordification izkazal kot izjemno učinkovit. Referenčne izvedbe so na voljo v platformi TextFlows, kar omogoča skupno rabo delovnih tokov in ponovljivost eksperimentov.

Text mining (TM) blends techniques from various fields such as machine learning, natural language processing (NLP), and information retrieval. Despite available tools such as NLTK and scikit-learn, reproducibility and systematic comparison of algorithms remain challenging. This thesis targets gaps in TM and NLP through three research directions: enhancing infrastructures, designing advanced workflows for NLP and bisociative knowledge discovery, and converting relational data mining into text mining tasks. It introduces TextFlows, a web-based platform for constructing and executing TM and NLP workflows. Novel knowledge discovery scenarios are addressed with the proposed methodologies and evaluated on real-world tasks. Focus areas include advanced NLP workflows, bisociative knowledge discovery, and transforming relational data mining into text mining problems, where the developed wordification approach proved to be very effective. Reference implementations are provided in the TextFlows platform, enabling workflow sharing and experiment repeatability.

Transformacija relacijske podatkovne baze v predstavitev v obliki vreče besed.
The transformation from a relational database representation into a Bag-of-Words feature vector representation.
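
The idea of turning relational data into a bag of words can be sketched as follows. This is a simplified illustration and not the TextFlows implementation: it assumes that every attribute value of a main-table row and of its related rows becomes one word of the form `table_attribute_value`. The function names, the tables, and the rows are hypothetical.

```python
from collections import Counter


def row_to_words(table, row, skip=()):
    """Turn each attribute value of a row into a word 'table_attribute_value'."""
    return [
        f"{table}_{attribute}_{value}".lower()
        for attribute, value in row.items()
        if attribute not in skip
    ]


def wordify(table, row, related, skip=()):
    """Build the word list for one main-table row and all rows linked to it."""
    words = row_to_words(table, row, skip)
    for related_table, related_rows in related.items():
        for related_row in related_rows:
            words.extend(row_to_words(related_table, related_row, skip))
    return words


# Hypothetical example: one customer with two linked orders.
customer = {"id": 1, "city": "Ljubljana"}
orders = [
    {"customer_id": 1, "item": "book"},
    {"customer_id": 1, "item": "pen"},
]
document = wordify("customer", customer, {"order": orders}, skip=("id", "customer_id"))
bag_of_words = Counter(document)  # word counts form the feature vector
print(document)
```

Once every main-table row has been converted into such a document, standard text mining steps such as TF-IDF weighting can be applied to the resulting corpus.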

Rok Piltaver

Gradnja razumljivih in točnih klasifikatorjev z algoritmi za iskanje zakonitosti v podatkih

Constructing comprehensible and accurate classifiers using data mining algorithms

Mentor in somentor / Supervisor and Co-Supervisor: Matjaž Gams, Mitja Luštrek

Rok Piltaver se je v svoji disertaciji osredotočil na gradnjo razumljivih in točnih klasifikatorjev z algoritmi za iskanje zakonitosti v podatkih. Na področju rudarjenja podatkov se oblikuje veliko modelov pod različnimi parametri. Piltaver je uvedel dve novosti. Prva je, da je v enem koraku oblikoval celotno Pareto fronto, začenši od najbolj transparentnega in natančnega modela. S petimi dokazanimi teoremi je pokazal več želenih lastnosti pristopa, npr.: nov algoritem za večciljno učenje hibridnih klasifikatorjev dokazano najde popoln nabor nedominiranih hibridnih dreves glede na njihovo natančnost in razumljivost. Druga novost je podroben pregled vpliva števila listov, faktorja razvejanja, globine drevesa in različnih lastnosti vizualizacije odločitvenih dreves na transparentnost.

In his PhD thesis, Rok Piltaver focused on constructing comprehensible and accurate classifiers using data mining algorithms. In data mining, many models are built under different parameter settings. Dr. Piltaver introduced two innovations. First, he constructed the entire Pareto front in a single step, starting from the most transparent and accurate model. With five proven theorems, he demonstrated several desirable properties of the approach; for example, the new algorithm for multi-objective learning of hybrid classifiers provably finds the complete set of non-dominated hybrid trees in terms of their accuracy and comprehensibility. The second innovation is a detailed examination of the impact of the number of leaves, branching factor, tree depth and various visualisation properties of decision trees on transparency.
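
The notion of a non-dominated set with respect to accuracy and comprehensibility can be illustrated with a small sketch. This is a brute-force filter over already built models, not the multi-objective learning algorithm from the thesis. It assumes that comprehensibility is approximated by the number of leaves (fewer is better); the candidate models and their scores are hypothetical.

```python
def pareto_front(models):
    """Return the models that no other model dominates.

    Each model is a tuple (name, accuracy, leaves). A model is dominated if
    another model is at least as accurate and at least as small, and strictly
    better in at least one of the two criteria.
    """
    front = []
    for name, accuracy, leaves in models:
        dominated = any(
            other_accuracy >= accuracy
            and other_leaves <= leaves
            and (other_accuracy > accuracy or other_leaves < leaves)
            for _, other_accuracy, other_leaves in models
        )
        if not dominated:
            front.append((name, accuracy, leaves))
    # Order from the most comprehensible (smallest) model to the largest one.
    return sorted(front, key=lambda model: model[2])


# Hypothetical candidate classifiers: (name, accuracy, number of leaves).
candidates = [
    ("stump", 0.70, 2),
    ("small", 0.80, 5),
    ("medium", 0.79, 8),   # dominated by "small"
    ("large", 0.88, 20),
    ("huge", 0.88, 35),    # dominated by "large"
]
front = pareto_front(candidates)
print([name for name, _, _ in front])
```

A user can then pick a model from the front according to how much accuracy they are willing to trade for a smaller, easier-to-read classifier.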

Jose Martin Rožanec

Napovedovanje povpraševanj z metodami strojnega učenja

Demand forecasting with machine learning methods

Mentor in somentor / Supervisor and Co-Supervisor: Dunja Mladenić, Blaž Fortuna

Doktorska disertacija preučuje uporabo strojnega učenja za napoved povpraševanja. Predlagali smo nov pristop za napoved občasnega povpraševanja, ki ima redno manjše ali večje razlike v povpraševanih količinah. Pristop deli napoved občasnega povpraševanja na dva modela: klasifikacijski model za napoved samega dogodka povpraševanja in regresijski model za napoved povpraševane količine. Delitev napovedi povpraševanja na dva dela omogoča boljši vpogled v samo napoved in razumevanje vzrokov netočnosti: ali napoved ni točna zaradi slabe napovedi samega dogodka povpraševanja ali zaradi slabe napovedi povpraševane količine. Za primerno ovrednotenje kakovosti napovedi predlagamo tri metrike: AUC ROC za ovrednotenje natančnosti klasifikacijskega modela, MASE za ovrednotenje natančnosti regresijskega modela ter SPEC za razumevanje vpliva napovedi na stroške zalog.

The thesis examines how machine learning can be applied in demand forecasting. In particular, it describes a novel approach toward lumpy and intermittent demand forecasting. It advocates using a two-fold model for forecasting lumpy (irregular demand occurrence, strong demand size variability) and intermittent (irregular demand occurrence, little demand size variability) demands when considering fixed forecasting horizons. The two-fold model comprises a classification model to predict demand occurrence and a regression model to predict the demand quantity. By dividing the forecasting problem into these two dimensions, insights into the reasons affecting the forecast quality are exposed. Furthermore, a new set of metrics is proposed to assess the quality of the forecasts: ROC AUC to determine the quality of the classifier, variations of the MASE metric to measure the quality of the regression component, and the SPEC metric to determine the impact of the forecast on the stock costs.

Pristop strojnega učenja k napovedovanju povpraševanja z dvema modeloma. (A) prikazuje osnovno arhitekturo za napovedovanje povpraševanja s klasifikacijskim modelom za napoved samega dogodka povpraševanja in regresijskim modelom za napoved povpraševane količine. (B) prikazuje diagram poteka s koraki za gradnjo in uporabo modelov za napovedovanje povpraševanja.
Two-fold machine learning approach to demand forecasting. (A) shows a basic architecture for demand forecasting when reframing demand forecasting as classification and regression problems. (B) shows a flowchart with steps followed to create the demand forecasting models and issue demand forecasts.
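
The two-fold decomposition described above can be sketched in a few lines. The two stand-in models below are deliberately trivial and are assumptions made for illustration only: the "classifier" is the share of past periods with non-zero demand, and the "regressor" is the mean of past non-zero demand sizes. The thesis uses trained machine learning models in both roles; the function names, the threshold, and the demand histories are hypothetical.

```python
from statistics import mean


def occurrence_probability(history):
    """Stand-in classifier: share of past periods in which demand occurred."""
    if not history:
        raise ValueError("history must not be empty")
    return sum(1 for demand in history if demand > 0) / len(history)


def expected_size(history):
    """Stand-in regressor: mean size of past non-zero demands."""
    sizes = [demand for demand in history if demand > 0]
    return mean(sizes) if sizes else 0.0


def two_fold_forecast(history, threshold=0.5):
    """Forecast a quantity only if demand is predicted to occur at all."""
    if occurrence_probability(history) >= threshold:
        return expected_size(history)
    return 0.0


# Hypothetical demand histories (units per period).
lumpy = [0, 12, 0, 0, 3, 0, 0, 0, 20, 0]    # demand occurs in 3 of 10 periods
frequent = [4, 0, 5, 6, 0, 5, 4, 6, 0, 5]   # demand occurs in 7 of 10 periods

print(two_fold_forecast(lumpy))     # occurrence predicted as unlikely
print(two_fold_forecast(frequent))  # occurrence predicted, size from the regressor
```

Keeping the two parts separate makes it possible to evaluate them separately, for example the occurrence part with ROC AUC and the size part with MASE, as proposed in the thesis.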

Abdul Sittar

Analiza preprek za širjenje novic

Information spreading barriers in news

Mentor in somentor / Supervisor and Co-Supervisor: Dunja Mladenić

Abdulove raziskave se ukvarjajo z računalniško lingvistiko, strojnim učenjem in obdelavo naravnega jezika, zlasti z raziskovanjem ovir za širjenje informacij v novicah. Disertacija se osredotoča na tri probleme, povezane s preprekami za širjenje novic. Prvi problem vključuje uporabo teorije informacijskih kaskad in analizo dogodkovno usmerjenega širjenja novic. Drugi problem je tematsko modeliranje in pristopa k razumevanju političnih in ekonomskih razlik v poročanju novic. Tretji je profiliranje preprek za širjenje novic, zasnovano na stilu poročanja. Pri svojem delu je sodeloval z družboslovnimi raziskovalci na School of Advanced Study Univerze v Londonu in prispeval k projektom analize podatkov za The British Library.

Abdul Sittar’s research covers computational linguistics, machine learning, and natural language processing, and in particular explores barriers to the spreading of information in news. His PhD thesis focuses on three interconnected issues related to news spreading barriers. The first issue involves adopting information cascade theory for news articles and analyzing event-centric news dissemination. The second issue concerns the evaluation of improved topic modeling and a strategy for understanding political and economic contrasts in news reporting. The third issue is profiling news spreading barriers, a task of classifying news texts based on the stylistic choices of their publishers. Collaborating across disciplines, he worked with social science researchers at the School of Advanced Study, University of London, and contributed to data analysis projects for The British Library, UK.

Slika ponazarja zaporedje nalog v cevovodu, pri čemer so poudarjene primarne teme v zvezi z ovirami za širjenje novic, ki smo jih v doktoratu naslovili.
The figure illustrates the sequence of tasks in a pipeline, highlighting the primary subjects explored related to news spreading barriers throughout the course of PhD studies.

Jasmina Smailović

Analiza sentimenta v tokovih kratkih spletnih sporočil

Sentiment analysis in streams of microblogging posts

Mentor in somentor / Supervisor and Co-Supervisor: Martin Žnidaršič, Nada Lavrač

Dr. Jasmina Smailović je v svoji disertaciji Analiza sentimenta v tokovih kratkih spletnih sporočil razvila metodologijo za analizo sentimenta, prilagojeno posebnostim kratkih spletnih sporočil in pretočnih podatkov v statičnih in dinamičnih okoliščinah. V statičnih se klasifikator nauči enkrat in ostane nespremenjen, v dinamičnih pa se klasifikator sproti prilagaja toku prispelih podatkov. Ker sporočila ne izražajo nujno le pozitivnih ali negativnih mnenj, je formalizirala dva koncepta nevtralne cone, ki omogoča binarnemu klasifikatorju metode podpornih vektorjev, da podatke razpozna tudi kot nevtralne. Razvito metodologijo analize sentimenta je empirično ovrednotila s tokovi sporočil o financah izbranih podjetij in jo uspešno uporabila v realni aplikaciji za spremljanje javnega sentimenta v zvezi s slovenskimi in bolgarskimi volitvami. Njeni dosežki v zvezi z disertacijo so bili objavljeni v štirih člankih v revijah in več prispevkih na znanstvenih konferencah.

In her dissertation Sentiment Analysis in Streams of Microblogging Posts, Dr. Jasmina Smailović developed a sentiment analysis methodology adapted to the specifics of microblogging messages and streaming data in static and dynamic settings. In the static setting, the classifier is trained once and remains unchanged, while in the dynamic setting the classifier is continuously adapted to the stream of arriving data. As messages do not necessarily express only positive or negative opinions, she formalized two concepts of a neutral zone, which allow a binary Support Vector Machine classifier to also classify tweets as neutral. The methodology was empirically evaluated on streams of messages discussing the finances of selected companies and was successfully used in a real-world application for monitoring public sentiment related to the Slovenian and Bulgarian elections. Her achievements related to the thesis were published in four journal papers and several conference contributions.

Koncept relativnega nevtralnega območja z zanesljivostjo kot funkcijo razdalje do SVM hiperravnine.
The concept of relative neutral zone with reliability as a function of the distance from the SVM hyperplane.
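
As a rough illustration of the neutral zone idea summarized above, the following Python sketch labels a message as neutral when its signed distance from the separating hyperplane is small. The function name, the threshold value, and the fixed-width zone are assumptions made for illustration only; the thesis formalizes its own neutral zone concepts, including a relative one in which reliability is a function of the distance, and those definitions are not reproduced here.

```python
def classify_with_neutral_zone(distance, threshold=0.5):
    """Map a signed distance from a binary classifier's hyperplane to a label.

    `distance` is assumed to be positive on the "positive" side of the
    hyperplane and negative on the other side. `threshold` is a
    hypothetical half-width of the neutral zone.
    """
    if abs(distance) < threshold:
        # Too close to the hyperplane to be trusted as positive or negative.
        return "neutral"
    return "positive" if distance > 0 else "negative"


# Example distances, e.g. as produced by a trained binary classifier.
labels = [classify_with_neutral_zone(d) for d in (-1.3, -0.2, 0.1, 0.9)]
```

A wider zone assigns more messages to the neutral class and leaves fewer, but more confident, positive and negative predictions.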

Blaž Škrlj

Učinkovito nevro-simbolno strojno učenje

Scalable Neuro-Symbolic Machine Learning

Mentor in somentor / Supervisor and Co-Supervisor: Nada Lavrač

Disertacija se osredotoča na nevro-simbolno učenje s kombiniranjem nevronskih in simbolnih tehnik strojnega učenja s ciljem razvoja razložljivih in skalabilnih napovednih modelov iz tabel, grafov in tekstovnih podatkov. Raziskali smo prednosti nevro-simbolnih pristopov za klasifikacijo omrežnih vozlišč, tekstov in za rangiranje značilk. Glavni prispevki vključujejo metode za nevro-simbolno učenje iz relacijskih podatkov in vložitve omrežnih vozlišč. Razvili smo orodje autoBOT za avtomatizacijo klasifikacije tekstov, ki združuje simbolne in subsimbolne reprezentacije, dodatno obtežene z evolucijsko optimizacijo, kar omogoča visoko klasifikacijsko točnost in dobro razložljivost. Zadnji del disertacije se osredotoča na rangiranje značilk, kjer smo izboljšali razumevanje odnosa med nevronsko pozornostjo in rangiranjem ter izboljšali skaliranje algoritmov iz družine Relief.

The thesis focuses on neuro-symbolic learning, combining neural and symbolic machine learning techniques, with the aim to develop explainable and scalable predictive models learned from tables, graphs, and text data. We investigated the benefits of the neuro-symbolic paradigm for learning network node representations, classifying texts, and ranking features. Key contributions include applying neuro-symbolic learning to relational databases and network node embedding. We developed autoBOT, a neuro-symbolic autoML system for text classification that uses evolution-based optimization for interpretable models at both feature type and individual feature levels. The last part of the thesis explores feature ranking, showing that the neuro-symbolic approach enhances understanding of the relationship between neural attention and ranking, and improves scaling of Relief-based methods.

Gruče odkrite s Silhouette pristopom za odkrivanje skupnosti iz vektorskih predstavitev omrežnih vozlišč.
Communities in the Human Affinome network, discovered by the embedding-based Silhouette community detection (SCD) approach, which clusters network node embeddings.

Urban Škvorc

Razumevanje vpliva problemskih pokrajin v numerični optimizaciji črne škatle

Towards understanding the impact of problem landscapes in numerical black-box optimization

Mentor in somentor / Supervisor and Co-Supervisor: Peter Korošec, Tome Eftimov

Doktorska disertacija je osredotočena na analizo optimizacijskih problemov z uporabo metode analize preiskovanja. V disertaciji je izvedel analizo komplementarnosti najbolj znanih optimizacijskih testnih okolij. Pokazal je, da okolja vsebujejo različne optimizacijske probleme, ki posledično vplivajo na različne rezultate primerjalnih analiz optimizacijskih algoritmov. Nato se je osredotočil na najbolj pogosto uporabljeno metodo analize pokrajine za opis značilk optimizacijskih problemov, kjer je preveril vpliv transformacij premika in skaliranja na izračune značilk. Pokazal je, da je relativno malo značilk, ki so invariantne na ti dve preprosti transformaciji, ter da lahko z njimi relativno dobro opišemo probleme v primerjavi z uporabo celotnega nabora značilk. Nazadnje je še eksperimentalno preveril in pokazal, da s trenutnim naborom značilk za opisovanje problemov ni možno izvesti generalizacije napovedovanja zmogljivosti optimizacijskega algoritma. Svoje delo nadaljuje na podoktorskem mestu v skupini za strojno učenje in optimizacijo na Univerzi v Paderbornu, Nemčija.

The doctoral dissertation focuses on the analysis of optimization problems using exploratory landscape analysis. In his dissertation, he analyzed the complementarity of well-known optimization benchmarks. He showed that the benchmarks contain different optimization problems, which in turn affects the benchmarking results of optimization algorithms. He then focused on the most commonly used exploratory landscape analysis methods for characterizing optimization problems, examining the impact of translation and scaling on feature calculations. He showed that relatively few features are invariant to these two simple transformations, and that these features describe problems relatively well compared to the entire feature set. Finally, he showed experimentally that the current set of features for describing problems does not allow the performance prediction of an optimization algorithm to generalize. He continues his research in a postdoctoral position within the Machine Learning and Optimisation group at Paderborn University, Germany.

Rezultati izračuna Pearsonove korelacije med izbranimi značilkami, pridobljenimi z uporabo raziskovalne analize pokrajine. Večji in krepkejši krogi predstavljajo višjo korelacijo, ki je pozitivna (modra) ali negativna (rdeča). Rdeče oznake predstavljajo značilke, ki so bile obdržane za nadaljnje eksperimente, medtem ko črne oznake predstavljajo značilke, ki so bile odstranjene iz nabora značilk. Slika vzeta iz doktorske disertacije.
The results of the Pearson correlation calculation between selected exploratory landscape analysis features. Bigger and bolder circles represent higher correlation, which is either positive (blue) or negative (red). Red labels represent the features that were retained for further experiments, while black labels represent features that were removed from the feature set. Figure taken from doctoral dissertation.
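
The invariance check mentioned in the summary above can be illustrated with a small Python sketch. It compares two simple statistics of sampled objective values before and after a shift and scale: the mean changes, while the skewness does not. The test function, the sample size, the transformation constants, and the choice to transform the objective values are all assumptions made for illustration; the summary does not state which features the thesis examined or how the transformations were applied.

```python
import math
import random


def skewness(values):
    """Sample skewness: third central moment over variance to the power 1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


def sphere(x):
    """Hypothetical test problem: sum of squares."""
    return sum(xi * xi for xi in x)


rng = random.Random(0)  # fixed seed so the sample is reproducible
sample = [[rng.uniform(-5, 5) for _ in range(2)] for _ in range(200)]
y = [sphere(x) for x in sample]

# Shift and (positively) scale the objective values.
y_transformed = [3.0 * v + 10.0 for v in y]

mean_changed = not math.isclose(
    sum(y) / len(y), sum(y_transformed) / len(y_transformed)
)
skew_unchanged = math.isclose(
    skewness(y), skewness(y_transformed), rel_tol=1e-9
)
```

A feature that behaves like the mean here would describe the transformed problem as a different problem, which is the kind of sensitivity the summary refers to.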

Aleš Tavčar

Učenje in kloniranje obnašanja v varnostnih domenah

Behaviour analysis and cloning in security domains

Mentor in somentor / Supervisor and Co-Supervisor: Matjaž Gams

V svoji doktorski disertaciji z naslovom “Učenje in kloniranje obnašanja v varnostnih domenah” je raziskoval metode strojnega učenja in agentnih sistemov za razvoj računalniško podprtega usposabljanja npr. v varnostnih nalogah. Disertacija se osredotoča na analizo, identifikacijo in kloniranje značilnih vzorcev obnašanja z uporabo več-agentnih sistemov, kar omogoča boljše razumevanje in usposabljanje kadrov v varnostnih domenah. Kandidat predlaga inovativne pristope za analizo kognitivnega stanja in obnašanja ter razvija algoritme, ki bi identificirali vzorce obnašanja iz gibanja skupine nasprotujočih si agentov. Prav tako je razvil ogrodje za kloniranje obnašanja, ki izboljša procese usposabljanja in skrajša čas doseganja usposobljenosti. Delo, ki združuje več vidikov analize in simulacije obnašanja, med drugim tudi prikaže načine za izboljšanje delovanja v konfliktnih situacijah, ima potencial za aplikativno uporabo v različnih varnostnih scenarijih. Sodeloval je v več evropskih projektih in razvil inovativne rešitve na EU nivoju.

In his PhD thesis “Learning and cloning behaviour in security domains”, he investigated machine learning and agent-based methods for the development of computer-based training, e.g. in security tasks. The thesis focuses on the analysis, identification and cloning of typical behaviour patterns using multi-agent systems, which allows for a better understanding and training of personnel in security domains. The candidate proposes innovative approaches to analysing cognitive state and behaviour and develops algorithms to identify behaviour patterns from the movement of a group of antagonistic agents. He has also developed a behaviour cloning framework that improves training processes and reduces time to competence. This work, which combines several aspects of behaviour analysis and simulation and demonstrates ways to improve performance in conflict situations, has the potential to be applied in a variety of security scenarios. He has participated in several European projects and developed innovative solutions at the EU level.

Nejc Trdin

Platforma nove generacije za večkriterijsko kvalitativno modeliranje z metodo DEX

New generation platform for multi-criteria qualitative decision modelling with method DEX

Mentor in somentor / Supervisor and Co-Supervisor: Marko Bohanec

DEX je kvalitativna metoda večkriterijskega modeliranja, namenjena podpori odločevalcem pri zapletenih nalogah odločanja, ki vključujejo več v splošnem nasprotujočih si kriterijev. Modeli DEX so sestavljeni iz kvalitativnih drevesno strukturiranih atributov in odločitvenih tabel. DEX je bil uporabljen v številnih praktičnih aplikacijah po vsem svetu. Ta disertacija razvija in razširja DEX v naslednjih smereh: podpiranje hierarhij atributov (namesto le dreves), uvajanje numeričnih atributov in njihovo kombiniranje s kvalitativnimi, razširitev funkcij združevanja na druge tipe, omogočanje verjetnostnega in mehkega združevanja vrednosti ter podpiranje relacijskih (»ena proti mnogo«) modelov. Razširjena metoda se imenuje DEXx. Metoda je bila realizirana v obliki programske knjižnice in ovrednotena na štirih zahtevnih praktičnih odločitvenih problemih. Razširitve povečajo izrazno moč metodologije in olajšajo reševanje večjega razreda odločitvenih problemov.

DEX is a qualitative multi-criteria decision modelling method aimed at supporting decision makers in complex decision-making tasks involving multiple and possibly conflicting criteria. DEX models are composed of qualitative tree-structured attributes and decision tables. DEX has been used in numerous practical applications worldwide. This thesis develops and extends DEX in the following directions: supporting full hierarchies of attributes (rather than trees), introducing numeric attributes and combining them with qualitative ones, extending aggregation functions to types other than decision tables, facilitating probabilistic and fuzzy aggregation of values, and supporting relational ("one-to-many") models. The extended method is called DEXx. It was implemented as a software library and evaluated on four complex real-life decision-making use cases. The extensions increase the expressive power of the methodology and facilitate solving a larger class of decision problems.

Tri metodološke razširitve v okviru DEXx: uvajanje numeričnih atributov, verjetnostno in mehko računanje ter podpora relacijskih modelov.
Three methodological extensions in DEXx: introducing numeric attributes, facilitating probabilistic and fuzzy computations, and supporting relational models.
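
To make the notion of qualitative attributes aggregated by a decision table more concrete, here is a minimal Python sketch. The attribute names, value scales, and table entries are hypothetical and chosen only for illustration; the sketch does not use the DEXx library and covers none of the extensions listed above.

```python
# Hypothetical qualitative value scales, ordered from worst to best.
PRICE = ("high", "medium", "low")
QUALITY = ("poor", "good")

# Decision table: every combination of input values maps to an
# aggregated qualitative value of the parent attribute.
DECISION_TABLE = {
    ("high", "poor"): "unacceptable",
    ("high", "good"): "acceptable",
    ("medium", "poor"): "unacceptable",
    ("medium", "good"): "good",
    ("low", "poor"): "acceptable",
    ("low", "good"): "good",
}


def evaluate(price, quality):
    """Look up the aggregated value for one alternative."""
    if price not in PRICE or quality not in QUALITY:
        raise ValueError("value outside the attribute's qualitative scale")
    return DECISION_TABLE[(price, quality)]


result = evaluate("medium", "good")
```

In a full model, the output of one such table would feed into the table of the attribute above it in the hierarchy.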

Anže Vavpetič

Semantično odkrivanje podskupin

Semantic subgroup discovery

Mentor in somentor / Supervisor and Co-Supervisor: Nada Lavrač

Disertacija obravnava semantično odkrivanje podskupin (SOP), združitev relacijskega podatkovnega rudarjenja in tehnologij semantičnega spleta. SOP izboljšuje tradicionalno odkrivanje podskupin z ontološkim in strukturnim predznanjem. Razvili smo okvir in algoritme SOP: SDM-SEGS, SDM-Aleph in Hedwig, ki smo jih ocenili na mikromrežah, pri čemer se je Hedwig odrezal najbolje. Programska oprema je odprtokodna, dostopna kot Python paketi in ClowdFlows gradniki. SOP smo uporabili pri razlagi podskupin bolnic z rakom na prsih, analizi DNK aberacij in finančnih novicah. Pri raku smo uporabili Gene Ontology in potrdili rezultate prejšnjih raziskav. Pri DNK smo s SOP, združenim z gručenjem in vizualizacijo, pridobili nova spoznanja. Pri analizi finančnih novic smo povezali percepcijo trga s članki. Disertacija prispeva k relacijskemu podatkovnemu rudarjenju z razvito knjižnico in gradniki za ClowdFlows, kar olajša dostop do metod relacijskega podatkovnega rudarjenja.

The thesis addresses semantic subgroup discovery (SSD), merging relational data mining and semantic web technologies. SSD improves subgroup discovery using ontological and structural background knowledge. We developed an SSD framework and new algorithms: SDM-SEGS, SDM-Aleph, and Hedwig. Evaluations on microarray datasets showed Hedwig performed best. Our open-source software is available as Python packages and ClowdFlows widgets. We applied SSD to three real-life scenarios: breast cancer patient subgroups, DNA aberration analysis, and financial news articles. For breast cancer, Gene Ontology was used for subgroup explanations, validating against existing studies. In DNA aberration analysis, SSD, with clustering and visualization, provided novel insights. For financial news, we linked market perception to news articles using SSD. The thesis advances relational data mining by developing a Python library and ClowdFlows widgets, facilitating easier access to RDM methods.

Delotok evalvacije in primerjave sistemov Wordification, Aleph, RSD in RelF, implementiran v platformi za podatkovno rudarjenje ClowdFlows.
Evaluation workflow for comparing Wordification, Aleph, RSD, and RelF, implemented in the ClowdFlows data mining platform.

Aljoša Vodopija

Karakterizacija zveznih večkriterijskih optimizacijskih problemov z omejitvami

Characterization of Constrained Continuous Multiobjective Optimization Problems

Mentor in somentor / Supervisor and Co-Supervisor: Bogdan Filipič

Aljoša Vodopija je v svoji doktorski disertaciji proučeval reševanje optimizacijskih problemov z več nasprotujočimi si kriteriji in omejitvami, ki jih ni moč obravnavati matematično in se jih zato lotevamo z metodami računske inteligence, predvsem z metahevrističnimi algoritmi. Za povečanje učinkovitosti teh algoritmov je razširil koncept problemske pokrajine, poznan iz enokriterijske optimizacije, na večkriterijske probleme z omejitvami, opredelil značilke, s katerimi opišemo lastnosti problemov, in definiral indikator kakovosti rešitev problema, ki upošteva tako izpolnjevanje omejitev kot konvergenco rešitev. S tem je opravil pomemben korak v smeri boljšega razumevanja problemov in izbire algoritmov v večkriterijski optimizaciji z omejitvami. Rezultate raziskav je kot prvi avtor objavil v treh revijah s faktorjem vpliva v prvi četrtini, med katerimi posebej izstopa revija IEEE Transactions on Evolutionary Computation, in v šestih konferenčnih prispevkih.

In his doctoral dissertation, Aljoša Vodopija studied optimization problems involving several conflicting criteria and constraints, which cannot be treated mathematically and are therefore tackled with computational intelligence methods, especially metaheuristic algorithms. To increase the efficiency of these algorithms, he extended the concept of the problem landscape, known from single-objective optimization, to multiobjective problems with constraints. He also introduced features to describe problem properties and defined a quality indicator for problem solutions that considers both the satisfaction of constraints and the convergence of solutions. This represents an important step towards better problem understanding and algorithm selection in constrained multiobjective optimization. As first author, he published his results in three journals with impact factors in the first quartile, among which IEEE Transactions on Evolutionary Computation stands out, and in six conference papers.

Vizualizacija problemske pokrajine večkriterijskega optimizacijskega problema z omejitvami.
Visualization of the problem landscape of a constrained multiobjective optimization problem.