<TITLE: Gaze Path Research Seminar
ACADEMIC DOMAIN: technology
DISCIPLINE: information technology
EVENT TYPE: seminar presentation
FILE ID: USEMP04D
NOTES: continuation of and continued in USEMD090, seminar also includes presentations USEMP04A/E

RECORDING DURATION: 84 min 44 sec

RECORDING DATE: 4.3.2004

NUMBER OF PARTICIPANTS: circa 30

NUMBER OF SPEAKERS: 3

NS2: NATIVE-SPEAKER STATUS: English (UK), French; ACADEMIC ROLE: senior staff; GENDER: male; AGE: 51-over

S3: NATIVE-SPEAKER STATUS: Croatian; ACADEMIC ROLE: senior staff; GENDER: male; AGE: 31-50

S8: NATIVE-SPEAKER STATUS: Finnish; ACADEMIC ROLE: senior staff; GENDER: male; AGE: 31-50

SU: unidentified speaker

SS: several simultaneous speakers>


<S3> <COUGH> thank you very much for having me erm here erm and er er let me tell you immediately first dispel all the rumours i'm not industrial designer <SS> @@ </SS> so this is not going to be from the point of view of er er design er neither am i a computer scientist so it's not going to be about programming <COUGH> er my er er interest is primarily in humans er i am a er a neuropsychologist and cognitive scientist so <COUGH> what i'll try to do is er look at how humans interact with a what i call digital environments and er er different kinds of er interaction devices different interfaces and er i'm also not an expert in er attentive er interfaces there there's plenty of of work done and i'll er er quote er er many things that were done by other people er but er er well others may know a lot and then then have er built ingenious er er solutions in terms of er er you know they they've er interaction devices er i think that knowledge of the humans <COUGH> is still extremely important because ultimately everything that we do is for humans and for human users so that's where the focus of my interest is er the other focus is i'm at the university of the arts er in philadelphia and er er i have a long-term interest in the arts and being a an artist myself so what you'll see today will be er in the context of er er building these kinds of interfaces er er in the area of art exploration er creating a personal relationship with works of art and art education er however all of these can be generalised to many <COUGH> (xx) so i'll start by telling you mhm er how i started working on @on@ this presentation and er <COUGH> it was er actually very common situation i had er microsoft word i i copied actually the the summary from the website that <NAME S1> has provided er and er <COUGH> i started writing these very deep thoughts er which now i forgot i thought i really wanted to say something very <COUGH> intelligent erm and while i was thinking about er well this is what 
happened er my screensaver <SS> @@ </SS> @kicked in@ and @completely@ disrupted the train of of thought er er and i had to go back and think again and <COUGH> and i was again on the verge of having this complete thought in my mind and you know what happened again with the screen- screensaver came in and this shows that picture of the screensaver actually it's more animated it puts these (xx) in front of you and it really attracts your attention mhm but er i never managed to recover <COUGH> that thought that would be extremely valuable at least for me @i don't know@ maybe not so much for you but er and i started asking <COUGH> myself why do things like this happen to me all the time then i realised it they happen to everyone and er , the answer is <COUGH> er things like this happen because the tool that i was using the word processor the computer er was not aware simply of me being there at work and er <COUGH> so what <COUGH> it created because of the lack of awareness er it created this er kind of intrusion er and er it should be often it may seem like a pretty small thing er er you'll see later on that er these intrusions er have very detrimental effect not only on our thinking but also how we communicate with others er so <COUGH> i started er then to look at this problem er the problem of awareness er of er our tools or our environments and er the the person who really did er er er er pioneering work er in terms of a er adaptive user interfaces is vertegaal and er what er historically the way he trained this the problem was er er that er the situation today er is very different than it was er let's say 20 25 years ago where er from 60s to 80s er you had er many users of a single system and and er these are the computers so at at that time they were the mainframe computers er that er served the needs of many users er er from 80s to 90s with the invention of personal computer er this relationship changed to er one to one there was er one person and er er er one 
tool one digital tool er from nine er from the 90s to er er er 2000 er the number of these tools er that are digitally supported er has grown er to many tools that we use er er from er cell phones to pagers to er personal digital assistants PDAs to er GPS devices and god knows what so the relationship suddenly changed <COUGH> er from er one to one to one person to many devices er and er what's going on right now as we <COUGH> as we speak er is that er this is going to change to er many people interacting with many devices in the same time so it it won't be just one person having er a number of devices that he or she interacts with er but there'll be devices everywhere er that everyone will be interacting with and er <COUGH> so what <COUGH> what happens is er er these devices er in order to interact with us er they have to er claim our attention they have somehow to er initiate interaction with us and er <COUGH> what i decided to call this <COUGH> in terms of functional definition not any more historical er that what this is creating is an intrusive interruption it's calling you it's claiming your attention oh by the way if anyone has a cell phone that's turned on please turn them off or put them in vibrate you know @@ because it's er i know how many times you have experienced that er <COUGH> in the middle of a movie or a er lecture and then someone's cell phone er rings and how actually intrusive and <COUGH> it is and how much it interrupts er so the devices that <COUGH> er right now that are in use are all over the are trying to interrupt us to er claim our attention and er <COUGH> many of them are doing this at the same time so if i go to my office at the university of the arts <COUGH> i have my cell phone i have my palm pilot er er i have a computer i have a regular phone that comes with the er office and there is a microwave in the hall <COUGH> that's beeping all the time and you know someone has er heated up their coffee for a a minute er and it tells the 
whole world that it's done by this very er irritating beeping noise and er <COUGH> very often i find myself <COUGH> er literally bombarded by <COUGH> these er calls for attention this look at me look at me pay attention to me this is important it's er er and sometimes they happen simultaneously i actually for some strange reason i i'm not superstitious but there's some really strange reason that all these things er have a tendency of happening at exactly the same time so i know that <COUGH> when my office phone rings and i'm talking with the department head that my cell phone will ring and it will be a friend of mine that wants to ask me to go out for a movie and that my computer will suddenly go berserk with those new e-mails er er will beeps like da-da-da-daa or whatever you use <COUGH> er and er the net result of this <COUGH> is er i don't do any work anymore <COUGH> @i am just@ subjected to these intrusive interruptions and my life is shattered and er i didn't manage to finish this presentation @no i'm joking@ <SS> [@@] </SS> [@@] i did it on the aeroplane because you get to turn off the <COUGH> the cell phones and pagers er <COUGH> so er but it really is to and we're er maybe not erm aware of this but er mhm er let me go back to this slide that er how detrimental these <COUGH> er interruptions are to our work-flow and to our thinking there were some studies er now <COUGH> done in large corporations their cap- collectivity er declined and they were asking questions so why are people not working as much as they used to two years ago the same people <COUGH> well they realised <COUGH> that the biggest er er factor in production decline was checking for e-mail they did it that much then these little beeps that alert to that a new e-mail has come to your mailbox er they you interrupt whatever you're doing and you go and look <COUGH> who is writing an e-mail and er <COUGH> right now the percentage of e-mail that's not really work-related or or personally related the 
so-called spam mail is 70 per cent of all messages that are in circulation right now are spam mail which means that they're not for you they're not related to er what you're doing they're just there to distract you so <COUGH> a a very then er er er after this finding it was easy to fix the problem just by , telling the employees not to check their e-mail or having them check it at certain periods er during the day so every two hours or you know which kind of defeats the purpose because it becomes like a mailbox where the mail is still delivered you know like er have coffee in the morning and then you check it and then there's nothing else but er er the reason for this was that the effect of these interruptions er was er extremely detrimental and er well i don't have to <COUGH> er elaborate on this point er if you ever were working on on something where you had to kind of preserve the flow of your work and and keep the thread of what what you were doing you know how even a single phone call or someone knocking on your office door er er er what an interruption it can create where it takes you disproportional amount of time to recover er there are some studies that show that recovering from interruption like this er takes up to 15 minutes that you're at same level of er that you were at before so <COUGH> i started thinking then <COUGH> about the of how these environments then <COUGH> can be modified er er in a way that makes them less er intrusive in our lives and er <COUGH> w- the the first thing the the reason why er these er devices are , disrespectful and intrusive is because they really don't know what you're doing and er <COUGH> so s- some of them don't know even that you're there and they're doing something that you the others er er don't know what you're currently doing and they are trying to claim your attention like somebody's calling you on the cell phone so <COUGH> i <COUGH> tried to then er define <COUGH> er er the detection parameters what are the 
parameters er er of atten- er attentive environments they're related to detecting a living human being and of course the <COUGH> the first one er is detecting someone's presence and er er if you er er think about any form of communication any form of interaction er the most important factor for establishing any kind of communication is detecting the presence of the other party that's communicating so if i walked into this room and no-one <COUGH> was here er i probably wouldn't be talking but i do now <COUGH> er or maybe i would <SS> [@@] </SS> [you don't know everything about me @@] so er and we take this for granted this er knowledge that someone is there because it's a prerequisite for interaction for communication howe- however as as what you saw with my computer that that machine although it looks nice and it's so small and handy and so on er er er doesn't even know that i am there trying to use it and er you'll also see from some of the examples i'll i'll provide you with that er er detecting presence er is not rocket science er er that they could be er done very cheaply and there are many other devices not as sophisticated as er computers that er have this er ability in them the next level <COUGH> of detection is er which could er allow us er to build more intelligence into these interfaces is not just de- detecting someone's presence but their relative distance from the attentive object er er the proximity and why is proximity er er so important well because er proximity er is one of the signs of users of human interest so if you're interested in something then i just saw this model of a computer that's the same model but much better than mine <COUGH> er you come closer to it you examine you examine it so <COUGH> proximity not only then indicates that you're there but it also indicates <COUGH> that you have an interest in this particular object or <COUGH> artefact er the third level as i see then and and er feel free to rearrange this list or to add or or 
subtract some of the er items er the the third er er factor in my hierarchy would be establishing then the identity of er whoever is there willing to communicate and interested in something <COUGH> and er er there are number of er technologies that exist today er er that er can er very er reliably <COUGH> er er establish one's identity er whether it's based on a a specific er voice-print voice-pattern or your er the pattern of your iris or your thumb-print or er some technologies now use the whole hand-print er the more sophisticated ones er er are establishing identity through facial recognition and even more sophisticated ones er through the er very er revealing an individual individual patterns of one's gait so when i'm walking although two people may seem they walk alike they make steps they move from point A to point B just because of the fact that our bones are of slightly different length and we carry slightly different mass on them and it's distributed in slightly different place er it creates a certain pattern which is very er individual er so there are now er er applications that can establish your identity even if you're half a mile away just by the way you walk <NS2> mhm </NS2> it'll recognise it's you er and it doesn't care if you're wearing different clothes or a big coat or whatever <COUGH> er what is er <COUGH> really interesting with the er er er and it goes into the er er into other er parameters is that er the new er er applications and they're mostly <COUGH> used for security purposes like to establish identity of access to certain buildings rooms or computers that er er there is not only pattern er that's highly individual for each person er but there is also an intentional pattern that you can detect so er in a er some of the supermarkets now and and big stores that can afford to buy the software er er they're using the er theft detectors er er er which actually the software is monitoring the way people move around the store and it seems that 
there is a particular pattern if i had the intention of stealing <SS> [@@] </SS> [your pen there <COUGH>] that i suddenly start moving completely unconsciously in a different way which is ever so slightly different from my normal way of moving although i'll try my best to hide my intent but er er obviously this can be er picked up er er and er so what now you have these like er store guards just sitting in front of a computer screen and and they're they're (xx) outline cams er er that identifies er these thieves before they become thieves which is in a way nice because they can then <SS> [@@] </SS> [the the guard] can just walk out and and and deter them in their intent never really knowing whether it would happen or not but <NS2> @mhm@ </NS2> it's still nice <NS2> mhm </NS2> okay the next factor er is er detection of er er activity and this activity can be er er done in different modalities er as er commands <COUGH> er and er they don't have to be vocal commands they can be typed or er detection of of gestures er of er of voice commands and so on er the activity that provides the er er ability to detect and er understand an activity er allows then er the er application to actively communicate er with the user through the interpretation of a user's er explicit intention er because here we're using er er the activity as a trigger for an action so it it's not very different from pressing the enter key on your keyboard but it's much more natural er if i can just er show , point to something literally and select it than move the cursor (atop) <NS2> mhm </NS2> er the , next factor in in my hierarchy is ability of these applications er to detect the er focus of attention and er i divided these er er er i specifically looked at the current focus er of attention which is er er very often tied to the functionality of er the design itself the functionality of the interface itself where you have to er focus on certain things that are er doing something er for you er and er if 
the application is able to detect er your focus of attention then in a way it can prepare er it can make the execution easier for you er er i can give you an example like that er er er now there's a experiments with er <NS2> mhm </NS2> er personal digital assistants er er like palm-pilots and also cell phones and anything that has a database er in it er where er you have to use them in different contexts including er driving in a car or or using only one hand and so on er that can detect your gaze <COUGH> direction and scroll down without you having to do er anything so er ability to to detect current focus of attention suddenly becomes er er very handy er and er allows you to do things in a more easier way <NS2> mhm </NS2> and er er the last one which i <COUGH> er the last factor that these er er attentive er environments should er er ideally be able to do er is to infer your intention er but the intention er in a sense of er er of your goal what is it that you are really doing because when you are calling someone er on the level of detecting current focus er you are just dialling a certain number er inferring intention would be er if you dialled this number four times in a row within the last five minutes then it thinks it's really important for you to reach this person and er that's the next step when the application can recognise urgency your goal that you want to reach this person at all cost that it's very important for you , okay <COUGH> just a little examples of a er from helsinki airport er @and and other@ airports i have visited in the past 24 hours er of er attentive environments and er as you can see the attentive environments come in all shapes and forms they don't have to be computer-based er and er er the most commonly er seen one er are those installed in the er er toilets of er of public facilities er er which have the sensors that detect your presence and er and either flush the toilet or er or s- s- or start a <COUGH> er turn turn on the water
er i was er since i was really thinking about these (xx) and playing with all these i i think i became a terrorist suspect for some <SS> [@@] </SS> [guards because] i was playing with every single toilet <NS2> mhm </NS2> <SS> [@@] </SS> [and tried them out] <NS2> mhm mhm </NS2> and i realised that er <COUGH> that some of them are really good some of them really work as er as they should but some of them are horribly stupid and and er are wasting er water because er even if you you trigger the action just by very act of of being there it it detects your presence but then even before you have a chance to do anything <SS> @@ </SS> @@ they flush which is it goes to waste and then they they do flush again when you leave <COUGH> but i think they could have made them (xx) some of them are er and (xx) <SS> @@ </SS> it's strange on the same airport you'll find both kinds <SS> [@@] </SS> [and er] the one actually in helsinki is the one that does it two times like er on the domestic flights terminal <SS> [@@] </SS> [not on the international] , only i'm only speaking from men's room i don't know ladies' room @<SS> [@@] </SS> [other- other-] otherwise i would become even more suspicious@ er o- okay so now <COUGH> i would like to <COUGH> show you so er is er some er er a practical examples er of er what er i was just talking about er and again er er it would be in the er context of a mhm er er in the context of a er of art education and art er exploration so let me just er <COUGH> since er we have to s- have the same point of view here , i just change my point of view here , so in order to have the same point of view i have to turn my back to you so <SU> mhm-hm </SU> to be (authentic) er <P:08> so let's say that <COUGH> er that w- there is a painting in a in a museum or somewhere that you're er really er interested in er and er <COUGH> or so as i said before your your natural instinct is to come closer and to examine it oh uh hey , see no hands i i i'm just looking and and what 
it does <COUGH> it detects my intention er er simply by detecting my presence and proximity and er in this context what what it says is like okay this guy is interested in me let me reveal more of myself to this person and although this is a <COUGH> an example from a er er art education er er where indeed just by coming closer er to er an interesting artefact you'll be able to <COUGH> zoom in or it will display itself er er for you in better detail you can also imagine er other applications er for example er reading the fine print in some contracts er when you are certain age like me the reading fine print is not anymore just nuisance but impossibility where you you must have like er er ten feet long arms to <COUGH> achieve this er so it would be really nice if could just come closer and this fine print will will become a nice large font that you can really read or you can imagine an application where you are reading the blurb er in new york times website <COUGH> and you want to know more and you can know more just by expressing your interest in what's going on er the (xx) again that these are the <COUGH> examples er from art education here wh- what i'm really playing with er and this is er er they're not very high quality image er is are some of the parameters er of this painting in this case i'm literally playing with light er that can change the atmosphere and the interpretation of this painting from very menacing one like er someone is throwing someone's child in er at night in in this river creek or to kind of idyllic er summer-day scene <NS2> mhm </NS2> , another example again er playing with some of the er ways we perceive art is er er playing with the er ability to er be focused what one is seeing so i'm literally here playing with the focus of a er of this painting which is what we often do and it's especially it's like a a s- signature of a real artist like er er to you know er er how how do you call it not blink but er to , what is it called in english 
i'm not a native speaker but er i guess then it is not (xx) for native speakers er to squint no not to squint er <SU> squint </SU> oh there is <SU> squint why not mhm </SU> to squint <SU> [mhm] </SU> [er] <COUGH> so you can abstract like all the details and then look more at the composition and er er for and er what's really nice here is that <COUGH> i'm mapping this er er experience <COUGH> er onto the physical gesture er so in a way it it becomes er er it can literally leave an experiential memory trace er so there'll be something that would allow you to recall this later easier and label er and that's how it would become your er active er knowledge without this experience even though you may understand what's going on er er it would be very hard to recall this and and i'm actually er er er making a parallel with er er <NAME NS2>'s er er presentation er and the fact that er er there is the memory of activity is er er the basis of our knowledge or our ability to even to produce er er some er very complex patterns er that fitted er a long time ago <COUGH> er then you can go <COUGH> even further and er and start mapping er er if er the environments er can detect your presence and your intent er you can allow them to in- er to detect your gestures the meaningful gestures now i'm talking to you and i'm waving my arms and so on and here i i mean er in er er and i'll so everyone can see like i'll i'll just squat here i'm in a gallery and i'm <COUGH> looking at the different artefacts and i just make a gesture er which is kind of really natural browsing gesture er and er which allows me to browse through the contents of a a museum <COUGH> or a gallery er as if i'm reading new york times and er my design is a mhm mhm environment and i'll show you the er er setup later er as a er kind of universal browsing <COUGH> mechanism which would er allow you to look at any content of any kind it could be web pages or new york times or a database of images or even er er er music CD 
tracks and so on and er the beauty of this er approach is mhm that once you <DISC CHANGE> (xx) one once it's built er there are no there is no wear and tear there are no moving parts or all you are doing is moving air er and er nothing can go wrong which is er for many designs er er or in many public places that's a er er a big problem that er er frequent interactions with certain objects er er tend to destroy them fairly quickly , and er then a- another er application would be to create er er or to to allow not just browsing but er er examination of objects that are er er valuable or fragile and so on er and also tied to a certain experience and <COUGH> i actually designed this er for young children <COUGH> in er museums who are ostracised often from er having access to er er the artefacts because they're so valuable and they're guarded in er glass boxes and so on er and er child- for children then they're uninteresting they're boring because there is no er ability to interact with these er and er like this you allow er children to control the artefact er without possibility of er er destroying it or er doing anything to the artefacts so i'll just go back to my presentation and if you er if you completely disagree with something i'm saying or you <COUGH> you agree 100 per cent er feel free to interrupt me at any point and and and just ask if you have a question er i'd rather have them at the moment when they're relevant than than er afterwards mhm <P:15> okay so <COUGH> here is then er the <COUGH> the setup <COUGH> at the phoenix museum of art er i call it the the gesture gallery and er a- as you can see it it consists of a of this little kind of conductor's er podium er and er <COUGH> the paintings er from the museum collection er are projected on the wall er in their original size so which eliminates yet another problem that er er er er one has with the er digital reproductions of er images because when you look at a pollock er on a 600 by 800 pixel screen er 
it's not the same as pollock in the er museum and er since scale really matters er and is integral part of the effects some of these artefacts have er this mechanism allows preservation of of scale and it's really magical because with a minimal intentional action with a minimal effort er er you get this huge painting er er float in front of you and er er you get er appropriate er or age-appropriate er er voiceover . okay er just an example of of er some of the gestures <COUGH> that er we use and er er is this recognisable this gesture is this the like er er american-italian er saying what </S3>
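[Editor's note: the detection hierarchy the speaker walks through above (presence, proximity, identity, activity, focus of attention, intention) and the proximity-as-interest idea behind the gesture gallery can be sketched in code. The following is a minimal illustrative sketch, not the speaker's own software; all names, thresholds, and the urgency heuristic (repeated dialling as a proxy for intention) are hypothetical, chosen only to mirror the examples given in the talk.]

```python
# Illustrative sketch of an "attentive interface" decision layer, following
# the hierarchy described in the talk: presence -> proximity -> identity ->
# activity -> focus of attention -> intention. All names and thresholds
# below are hypothetical assumptions, not the speaker's implementation.

from dataclasses import dataclass
from typing import Optional


@dataclass
class SensorReading:
    """One snapshot of whatever sensors the environment happens to have."""
    present: bool = False                # e.g. a motion/presence sensor
    distance_m: Optional[float] = None   # proximity sensor
    user_id: Optional[str] = None        # e.g. face, voice, or gait recognition
    gesture: Optional[str] = None        # detected activity / explicit command
    gaze_target: Optional[str] = None    # current focus of attention
    repeated_actions: int = 0            # e.g. same number dialled N times


def awareness_level(r: SensorReading) -> str:
    """Return the highest rung of the hierarchy this reading supports."""
    if r.repeated_actions >= 3:
        return "intention"     # urgency inferred from repeated behaviour
    if r.gaze_target is not None:
        return "focus"
    if r.gesture is not None:
        return "activity"
    if r.user_id is not None:
        return "identity"
    if r.distance_m is not None:
        return "proximity"
    if r.present:
        return "presence"
    return "none"              # the device doesn't even know you are there


def detail_for_distance(distance_m: float) -> str:
    """Proximity as a sign of interest: the closer the viewer comes,
    the more the artefact 'reveals of itself' (thresholds are made up)."""
    if distance_m < 1.0:
        return "zoomed"        # fine detail, large font, full resolution
    if distance_m < 3.0:
        return "full"          # original-size view
    return "thumbnail"
```

On this view, the screensaver anecdote at the start of the talk is simply a device stuck at level `"none"`: it interrupts because it cannot tell that anyone is present and working.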
<P:05>
<SU> don't know <SU> yes </SU> i don't [know] </SU>
<S3> [yeah] yeah exactly so well and you w- you just did the same thing <MOCK ACCENT> i i don't know i don't know <SS> @@ </SS> don't ask me my dear </MOCK ACCENT> who am i to know about it @@ and er , i wanted to dress like this and i figured that they told me it was really cold <COUGH> <SS> @@ </SS> er this was taken at a lecture @about attentive interfaces@ <SS> [@@] </SS> [@@] (xx) <SS> @@ </SS> <P:05> erm just a <COUGH> a a little overview of a er then <COUGH> i will i show you the example of the er detection of proximity and the gestures and the next one was the er er identity <COUGH> right now they're er commercially available like a number of er er devices that er use different approaches from er er looking at a a retinal pattern of blood vessels to iris and er facial recognition finger and hand-print and and voice and voice-pattern and er which would include an accent er er there are actually er there there is a whole line of products now er that er allow only <COUGH> certain users to access an er er computers er and er the one that's very interesting is the the er er fingerprint-mouse where you have a er a mouse where on the side <COUGH> of the mouse where the your thumb is er is a er a sensor which er detects your thumb-print and knows that it's you and then the computer works if someone else picks up your mouse and tries to work er it er <COUGH> won't allow him and you can pre-program this to accept a number of different users and and so on <COUGH> okay so now i'll <COUGH> move to what is the <COUGH> er kind of a overarching topic of this er <COUGH> seminar er and and this is er how to use er er er gaze <COUGH> and eye-tracking er as a part of building these er attentive interfaces again er er this will be the context of er interacting with a er works of art but can be generalised <COUGH> so just a a short and i'm i'm sure you hear this like ten times today er like about the <COUGH> er gaze direction and eye-tracking er but from a biological point of 
view the er gaze direction <COUGH> where we are looking at is the oldest and actually the earliest means of communication er which you can observe in er young parents or baby well they don't necessarily have to be young but sort of new parents with young children er who are trying to inf- infer their er their child's er intention er by looking er at their gaze direction and and very often they actually er even test the child by saying do you want this or do you want that and they look like which is more attractive and <COUGH> and then they say oh no she wants this she looked at it no no no no she was looking at that and <COUGH> er so it just means you need more sophisticated ways of detecting the gaze direction and luckily now we have so <NS2> mhm </NS2> actually did you kn- know about a little (xx) <COUGH> talking about inferring the er intention one of these er er techniques er actually the the er the voice-pattern er is er now being sold in the US er there is there are two things you can buy a pet translator er dogs it seems use only like er er 20 or 30 different patterns to express themselves it doesn't matter the size or the breed er and so they built this little device which you keep in your hand and and your fido or whatever the name is says <IMITATES BARKING> and then you look at it in the i want go for a walk <SS> [@@] </SS> [or pet me] you know like er kind of you know er infantile voice and er <SS> @@ </SS> and these people swear er they did er like interviews but probably it's just barking but there were these incredibly happy pet owners who finally <SS> [@@] </SS> [knew exactly] what their pets were telling them er if you were a cat person er er i'm sorry but it turns out that cats can make such a wide range of noises er er much more than dogs and they're so individual that there's no cat translator but another company immediately made a baby translator <SS> [@@] </SS> [human babies] don't have a huge range of er er expressive ability they are more 
like dogs than like <SS> [@@] </SS> [cats er so] there is this for young parents who are scared who don't know like oh she's crying all the time or is she hungry or (xx) maybe er there's this translator which resolves everything er by pressing the button it says feed me take me out i love you or i don't know it has ten <COUGH> commands and people really like it er anyway this is just a gadget don't let me wander off with these like next time you notice that i'm veering away from the subject please gently ask me to go back <COUGH> er so er not only it's biologically the er oldest and earliest means of communication er the gazer action can be er huge er as a first instance er of pointing when the child is actually physically unable to point because it's not coordinated f- enough er using the gaze and the looking at a at certain thing er it expresses its er interest which comes in really handy in terms of er talking about interface designs er where one of the main er actions is pointing and selection and clicking and dragging and so mhm er in humans er the <COUGH> or the biological signifin- significance of gaze direction er er is er can be documented er and er it seems that in humans er this way of communicating is much more significant than in other mammals including our close relatives like chimpanzees and you can see the er very obvious difference between a a chimpanzee eye and the human eye that er chimpanzees and for that matter dogs and er most of other mammals er are lacking er the the white <COUGH> area around the er iris the the sclera er which er actually makes the er gaze direction detectable at a great distance so er i can actually see er at the back of this room er whether you guys are looking at me or elsewhere er just because you happen to have this white area around <COUGH> your er irises er so mhm then for humans er this way of communicating using er eyes and gaze direction for communication er er er is a feature of a great biological significance 
and for us then it's a feature that er one can use in creating er devices that er can infer your intent er indeed er eye contact er er <COUGH> ability to maintain er er gaze to interlocutors is <COUGH> one of the first behaviours that develops in young infants and er eye contact could easily play a very significant role in social communication throughout <COUGH> life and here are just some of the examples of er the role that eye contact plays er for example just by using your eyes your gaze direction <COUGH> you can er interrupt a person or regulate the conversation flow so when i'm talking and then i've finished my sentence er i establish eye contact use signalling that now it's your turn and <COUGH> and that's why if you want to be rude then you avoid establishing eye contact with the person who is er talking to you it also <COUGH> er is a very good regulator of intimacy levels so the more intimate the relationship is the more of direct eye contact occurs and er that's er <COUGH> also very evident if you look at any corporation where there is a hierarchy er er of different statuses er you'll notice that er the greater the difference between two individuals in terms of their status er the lesser of direct eye contact er they have er it er indicates interest or disinterest er with eye contact we're receiving feedback we are expressing our emotions like when i say look what happened <COUGH> er we are er er influencing <COUGH> other people by signalling er our social status and hierarchy so er if i'm higher in the hierarchy er i'll be looking at you and you'll be averting your gaze from er er from er me and er in this way indicate er er dominance or submissiveness er so if i'm er er i i have a dominant role in certain interactions and er i'll be the one er looking at you and you'll be the one looking elsewhere so you can see even the <COUGH> the list can go on <COUGH> much er longer but er we are er all the time actually communicating <COUGH> er er very richly very 
complex er messages er just by using er er the gaze er direction the focus and the duration of the focus , so <COUGH> er my basic premise then is that er the eye movements er in humans are social regulators and er since er i'm my interest is in the works of art and er er our relationship with er works of art is er also social er and very intimate one er then one can say that <COUGH> if you want to design an interaction mechanism with er works of art er then the fact that the gaze behaviour our innate eye behaviours er are significant regulators er of social interactions er then er er one can use then detection of our gaze behaviour er in creating a reaction initiating an action which is external to the viewer and i must er er er tell you that this is a a very a it's a very different view from a the er one established mostly in eye- and gaze-tracking er circles er er wh- where wh- well you can track eye movements and er er visual path er patterns and so on er most often er er the research stops er where you want to use eye movements as a kind of triggers of actions because er er the claim or the perception is that er we don't use our eyes in in er real life er to trigger actions and i don't think it's er er true because of the examples i gave you er and er we actually very often use our eyes and gaze direction (xx) er then the other problem then <COUGH> in er , using this as a mechanism is er er and i'm sure those of you who are in eye-tracking er know this is the problem that is called the midas touch erm er king midas er er would turn everything into gold whatever he touched and was happy for the first time and er until he realised that even the food that he touched turned into gold and thus became <COUGH> inedible er er with er using the eyes eye movements as initiators of actions is hard because you cannot not look at things that you're looking at it's a simple fact which means that you're (tying) something to just looking at things er then you have to create a 
mechanism where some kinds of looking at things will be observation just looking and sometimes looking at the things would initiate actions and it's called the midas touch or the clutch problem in er @@ in eye-tracking i'll actually show you some of the solutions er how to <COUGH> to resolve this er but before i show you some of the er designs wh- what what's time time when when should i stop or how fast [should i talk] </S3>
<S8> [er] er probably quarter to </S8>
<S3> quarter to </S3>
<S8> five </S8>
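[The "midas touch" or clutch problem described above — and the dwell-based eyecons the speaker returns to later, which trigger only after sustained looking — can be sketched roughly as follows. This is a minimal illustration by the editor, not the speaker's actual implementation; the class and threshold value are hypothetical.]

```python
# Minimal dwell-time "clutch" for gaze interaction: a target fires
# only after the gaze has rested on it long enough, so ordinary
# looking does not trigger actions (the "midas touch" problem).
# All names and values here are illustrative assumptions.

DWELL_THRESHOLD = 0.6  # seconds of sustained gaze needed to activate

class GazeTarget:
    def __init__(self, name, x0, y0, x1, y1):
        self.name = name
        self.box = (x0, y0, x1, y1)
        self.dwell = 0.0       # accumulated time spent gazing at this target
        self.activated = False

    def contains(self, x, y):
        x0, y0, x1, y1 = self.box
        return x0 <= x <= x1 and y0 <= y <= y1

    def update(self, x, y, dt):
        """Feed one gaze sample (screen position plus elapsed time)."""
        if self.contains(x, y):
            self.dwell += dt
            if self.dwell >= DWELL_THRESHOLD and not self.activated:
                self.activated = True
                return True    # fire the associated action exactly once
        else:
            self.dwell = 0.0   # looking away resets the clutch
            self.activated = False
        return False
```

[In use, `update` would be fed gaze samples at the tracker's rate (e.g. 60-200 Hz); looks shorter than the dwell threshold remain pure observation.]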
<S3> five so another 20 minutes right okay i'll talk very fast now so <SS> [@@] </SS> [@@] er er we can you you've heard some of this information so it's i can really cover it er fast and there is a contrast with our subjective experience er like if you look at (xx) eyes move er smoothly and you just absorb what's out there but er when you observe human eyes you see that they are er twitching all the time er and er er moving at very er fast er speed er and then just staying put for very short periods of time the er those times when eyes are still they're called fixations er and the actually eyes spend much more time jumping around than er fixating objects er the paradox is that during er jumping around during these saccadic movements er the eyes are not taking in any visual information so they're taking in visual information only during these very short fixation points and area of their focus is very small and as <NAME NS2> said it corresponds to the size of your thumbnail when you extend your arm like this so er the only area of clear vision er is this big er at this er distance er there was one thing which i actually er <NAME NS2> (xx) er <NAME NS2> was er speaking er the necessity for our eyes to move er and er er you say it's er in order to pick up the information er but actually our eyes are moving even when we are not picking up information if you're just staring at one point er and er the er other reason for our eyes to move er is er comes from er <COUGH> er another physiological mechanism er er which is called habituation the fact that er er if you have the same stimulus all the time then the brain stops perceiving this stimulus because it's unimportant there's no change er so if you're wearing glasses only in the first moment you put them on you feel them then you forget about them and actually it's every day many people are looking for their glasses while they're on top of their head like and they're searching er for them or wearing the clothes or rings or
earrings or shoes now whenever i mention these things you may become aware for the moment er er er but then this awareness er vanishes and er the same er principle er goes for other senses like sense of er smell for example otherwise people who col- are collecting your garbage would not be able to do this but sense of smell adapts very quickly so they completely lose er er er it's only first couple of minutes that you can feel the smell and then you adapt and you don't feel it anymore that's why some people don't know that they smell because they've adapted to <SS> [@@] </SS> [their own smells] and they're surprised all the time like why are people running away <SS> @@ </SS> er the same <COUGH> habituation happens er for the sense of vision er which means mhm that er if er there was a constant er and steady stimulus that coming to the sense of vision then it would disappear then the question is well if you stare at something why doesn't it disappear er well because your eyes are moving all the time er and these movements are really to prevent the world from disappearing in front of your very eyes because you've focused onto something for two minutes and if you don't think it's true you can look at er studies done by er doing exactly this making contact lenses with er different patterns on them like stars or crosses or something that people would wear and that would move with their eyes s- which means they would produce exactly the same stimulus on the retina and er what happened was they would see these patterns for a first couple of minutes and then they would disappear because of the habituation er less dramatic example is like wearing glasses or when you start wearing glasses you see the frames all the time and then they just disappear and you don't see them unless you consciously pay attention <COUGH> okay <COUGH> er an example from a a psycholinguistic experiment of of how eyes move and i'll er just point out that er er the er er this is so-called a garden
path sentence that er is used in er er studies of er cognitive processing and er er er it's called garden path because the the first pass like reading through a sentence er at the beginning er kind of er er tempts you to go into the wrong direction er of interpreting the sentence and er you have to come back and you see the saccadic movements or the fixations here go one two three four five and then they go back to the er er part of the sentence that helps you disambiguate the <COUGH> er er meaning and which also er presents the er er er the most er er , it's most processing demanding part of the sentence so er it has been well documented that er the duration of fixations er in certain contexts is directly proportional to the cognitive complexity of what you're trying to er er achieve or extract from the environment , er i'm sure you are familiar with this er picture w- w- this this further complicates like er any eye-tracking er er studies the fact that er er er in reading you can er detect very er er definite patterns er of eye movements that er are actually stereotypical they don't change with different er users er however er er if you have different users er or viewers here and change the context then you get a er fairly different er er visual path er patterns and and you see here that er er free exploration yields one path like the question is what's the age of these er individuals and you will (set on) your path and i forgot what the other ones were that that <COUGH> were even more different , mhm er eye-tracking <COUGH> has been used now for er , analysing er er mostly the effect of a er of different marketing er materials in in this case er er even though their the faces er or lower parts of the of the faces of er of these individuals were covered er the the focus er er was still on er on their er faces and er there was a a definite er er er knowledge of hierarchy who is the doctor who is the nurse e- even though they were similar <COUGH> er er this is a er 
er o- one of the studies done by velichkovsky er er actually a a long time ago maybe er ten years ago er in a er analysing the , the eye movements er in er looking at the works of art in this case arcimboldo's er complex er paintings and er what velichkovsky coined er er the term that he coined was er the attentional landscape <COUGH> where er by looking at the points of er fixations and adding them up er you can er to some extent er er capture the information that was available to the viewers w- when observing er this particular er er art work and i think that er <COUGH> er this becomes extremely important er because er , er if you know exactly what features of the er observed object er were truly detected and processed by the user and then you can observe user's behaviour and collect additional data through interviews and so on then you can get a pretty good er er er insight into the cognitive processes cognitive and perceptual processes of different users in other words er you can build the context er where er these applications would know , almost what the users are thinking er er because if you create a constrained environment er er and then you have this knowledge of perceptual input you have a knowledge through interviews of er er personal er interpretation or analysis and so on er well then you can really go very far in designing these er attentive interfaces er i'll just skip these these are just er <COUGH> er ways of illustrating the different er er points of fixation that were shown before er and the their use <COUGH> would be in designing er er or providing all the information that's er that's necessary er er humans focus mostly on on the facial features er on the eyes and the lips er and so er in er er er creating photos er er you can get actually a much more vivid er impression er by following the human perceptual er er er pattern er and er so this person actually looks to humans more realistic than the other person because er the other details there 
er that are er displayed with the same acuity with the same (xx) are actually er competing with what we are interested in what what our actual perception er would be er eye-tracking is used in analysis of different er interface interfaces er for many of the er er very complex er systems like er like er like electrical power plants and nuclear plants and so on er <COUGH> and now <COUGH> what er i would like to i already talked about the the problem which is the midas touch problem mhm and er , here is a a solution that was done in 1995 er these are the eyecons er the er gaze aware er er eyecons which er er the more you look at them they signal their awareness er that you're looking at them er and er after some period of time they trigger an er action , then er the other problem with the <COUGH> with using meaningfully er eye-tracking and and which er er many of you here know is is that er that er you get huge amounts of data like er if you use any eye-tracker it's detecting <COUGH> er the eye movements and the gaze er er fixation points er sometimes 200 times a second and you can imagine just five minutes of <COUGH> collecting this data produces this er huge amount of of er numerical data and er very often er people don't know what to do with it and er how to classify it because er you can just count the fixations and say oh there were so many fixations you can look at their locations you can use them together er you can pronounce that er all fixations <COUGH> in the same region which you have to define er are er called gazes that you gazed at this area then you can talk about gaze duration and so on so right now we are <COUGH> at a er er at a point where this er terminology the vocabulary er of our understanding of the eye and gaze behaviour we are just developing this and er many er researchers are actually inventing new er ways to meaningfully quantify er this because on its own like l- just looking er at a path and if it doesn't it doesn't give you any
information a single fixation is 300 milliseconds then it's meaningless er especially among 10,000 other fixations okay the er this is a a er an attempt that was done er er in london in the national gallery in conjunction with a er er with the institute of behavioural sci- sciences in derby er and the where the visitors were observing er er works of art to my disappointment they were not observing actual works of art but er on a computer screen which i didn't know because it was nowhere explicitly mentioned so i was living under the er illusion that er oh this is a great study because they were actually looking at the paintings and then i realised it was just on a screen but they did very interesting studies because er er like here this er mhm mhm i think that er that the data here er er show the er superimposed er er er s- visual exploration paths er i think in the end it was more than 4,000 er users so er er you could actually see how stereotypical visual exploration patterns are and er now since i'm running out of of time er i'll stop talking and er rather show you er er an experiment eye-tracking experiment er which i <COUGH> er developed as a solution er to this midas touch <COUGH> er er problem and er er er it's it should really be a solution to how to use eyes and gaze er er as an interaction mechanism and and not just collect the data that are useful to us and er what i brought my mini eye-tracker with me which i carry with me all the time uh-huh i see <NAME S1> is very interested <SS> [@@] </SS> [@@] it er <COUGH> detects eyes through these holes <COUGH> and er <COUGH> i'll explain later it's not really like kind of a <COUGH> it it does the the (xx) so <COUGH> , and i'll assume the <COUGH> your point of view so you can really see what what's going on <COUGH> <P:06> okay , and er <P:07> so here's the <COUGH> application and here are my my hands <COUGH> and i'll put my mini eye-tracker on and er you can see that er i now , (xx) oh sorry (xx) erm so er
wherever i'm looking at i'm i'm er well this cursor is actually showing the where i'm looking at the screen and you see if i look at the er er at this er drawing on the screen that er the cursor changes to this focal point indicating that er this er artefact is gaze-aware it knows that i'm looking at it and then er if you want to <COUGH> er er do some action er what i created here is a er i created a place where you can choose what kind of action you want to be performed then you focus on certain part of this drawing and er here i'm focusing on the hand , and you see that the the cursor signifies zooming in and i can observe this er er er drawing er by wherever i look at there is this er giant magnifying glass which er er is attached to the er end of my gaze and let's say i've er examined this enough and i want to er get er another er er artefact and i want to perform some other action so er er i'll choose the action on this palette and this time instead of er zooming in what will happen is the i'll hook up the whole artefact er at the end of my gaze so now the whole thing is moving with my gaze and er if i move if i look suddenly to one side er i can flip this er er er painting i can literally throw it away left or right and get er er another one and examine the content of the whole er er gallery and er so what i just er showed you er is er an example er of using the er eye gestures and i call them er mhm er er eye gestures as er action mechanisms and er i'll just go back to the er er presentation for er er one quick minute er and then i'll finish er because what's important here is that er i think that so far er most of the interactions with the er that used eye and gaze-tracking er were time-based er or place-based er which means that you had to stare at one er location for a while er and er er while it works like you can stare like that's how the the er keyboard on the screen of this paralysed person where like you stare at different keys and you stare at one 
key for a prescribed amount of time it triggers and so on er but after a while it becomes extremely frustrating er that every single time you want to do something you have to wait you have to wait so i er proposed these er er i call them eye graffiti <COUGH> er which are er really eye gestures for example observe my eyes now that i'm talking like can you see what she's wearing today , er so i didn't say anything but you get the meaning because i'm using er the er eye gestures er to indicate something er er like oh they're making so much noise but er oh er and the <COUGH> which means that er i can actively use this these gestures in communication so i kind of mapped these er and they're very simple er er actions er if you are if you look suddenly to the left or to the right you throw an object if you er look suddenly up when you're fixating an object what you do you literally hook it up to the end of your gaze and then you can position it elsewhere and er <COUGH> er and this all depends i've actually er described this er in a paper that's available on the er on my website er so if you want to er go through the details of this interface you're more than welcome to er look at this er er paper er one more slide and and <COUGH> er , er this is a suggestion for a er eye-tracking er application er where er er actually i was asked to design this for a foundation in india er where there is this huge mural er er detailing <COUGH> er different er episodes from er gandhi's er life and what they wanted er was that something magical happens that because there's so many of these details and er er each one of these has a different story (behind it) er they wanted a <COUGH> an application where depending where the viewers were looking at which particular part of this mural they would to get the different voice-over so <COUGH> er this is this was my er er suggestion to them building up this er er er installation where you would have this kind of oval window er because this is meant 
for a single viewer single user although er just because of the price it's it's entirely possible to track eye movements and gaze direction of a number of people simultaneously <COUGH> but er what they wanted to pay for this was not @@ that much so this is a a single viewer er solution and i actually have built an application and er er tried it to the with an eye-tracker that i released and i'm still waiting for er the answer from them so if it doesn't go through if anyone is interested here and connected to er museums and so on let me know i'll be happy to er build something like this for you well thank you very much my time is up oh i'm overtime [@@] </S3>
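[The "eye graffiti" gestures demonstrated above — a sudden look left or right throws an object, a sudden look upward hooks it to the end of the gaze — can be sketched as a simple classifier over successive gaze positions. This is an editorial illustration of the idea, not the speaker's code; the distance threshold and gesture names are hypothetical.]

```python
# Sketch of eye-gesture ("eye graffiti") classification: a large,
# fast gaze jump in a given direction is read as a gesture, while
# small fixational movements are ignored (no midas touch). The
# threshold and returned labels are illustrative assumptions.

import math

SACCADE_MIN_DIST = 200.0   # pixels: jump size that counts as a gesture

def classify_gesture(prev, curr):
    """Map one gaze jump (prev -> curr, in screen coords) to a gesture."""
    dx = curr[0] - prev[0]
    dy = curr[1] - prev[1]
    if math.hypot(dx, dy) < SACCADE_MIN_DIST:
        return None                       # ordinary looking, no action
    if abs(dx) > abs(dy):
        return "throw_right" if dx > 0 else "throw_left"
    return "hook_up" if dy < 0 else None  # screen y grows downward
```

[A real system would also gate on saccade velocity and on whether an artefact is currently gaze-selected, but the mapping above captures the mechanism.]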
<SS> [@@] </SS>
<APPLAUSE>
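[The data-reduction vocabulary discussed in the talk — defining regions, calling all consecutive fixations in the same region a "gaze", and summing their durations — can be sketched as follows. This is an editorial illustration; the grid-based region definition and the (x, y, duration) tuple format are assumptions, not taken from the seminar.]

```python
# Reducing raw fixation data as described in the talk: consecutive
# fixations that fall in the same region are merged into a "gaze",
# whose duration is the sum of its fixations' durations. The square
# region grid and field layout are illustrative assumptions.

REGION = 100  # px: side of the square regions used to group fixations

def region_of(fix):
    """Map a fixation (x, y, duration_ms) to its grid region."""
    x, y, _dur = fix
    return (int(x) // REGION, int(y) // REGION)

def fixations_to_gazes(fixations):
    """Merge consecutive same-region fixations into [region, duration] gazes."""
    gazes = []
    for fix in fixations:
        reg = region_of(fix)
        if gazes and gazes[-1][0] == reg:
            gazes[-1][1] += fix[2]       # extend the current gaze
        else:
            gazes.append([reg, fix[2]])  # start a new gaze
    return gazes
```

[Accumulating the per-region durations over a whole session would give the kind of "attentional landscape" the talk attributes to velichkovsky.]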
