
The event-based syntactic-semantic video model (which we refer to as ESSVM) (Ahmet, 2004) proposes an Actor entity to specify the context-dependent role of a player in soccer. This model represents events such as free kick, goal and penalty, in which a player assumes different roles such as scorer or assist-maker. In dance videos, however, the contextual information of dance events has to be described at multiple levels, such as actor and agent, rather than at the single granularity of the actor entity.

COSMOS7 (Athanasios, 2005) models objects along with the set of events in which they participate, events along with their sets of objects, and temporal relationships between the objects. It does not model the temporal relationships between events or the contextual roles. It models events only at a high level, such as speak, play and listen, whereas a dance video model needs a more detailed level of event representation, covering agents, their actions, the speed of each action, the associated song and so on.

3. The Semantics of Dance Videos

Generally, dance information is dominated by visual content, such as steps, posture and costume, and by the accompanying audio, such as song and music. Hence, dance videos are rich in semantics and provide ample scope for efficient semantic retrieval and dance video mining. This section describes the song that accompanies the dance performance, the different dance video types and the features of dance videos in detail.

3.1. Song Granularity

A dance video contains several dance steps representing each song. A classical dance video is simply a collection of songs choreographed on a stage or in a theatre and recorded with a single start-stop camera operation (Cheng, 2003). A movie, on the other hand, contains several songs, and for each song dance steps are choreographed by the dancers. A song in a movie may be recorded with multiple start-stop camera operations. For instance, an Indian movie normally contains about five to six songs. Here, each dance step may be a step from any of the Indian dances or a new step devised by the choreographer. Further, it includes presentation aesthetics such as mood, feelings and emotion.

A song is composed of four parts: Introduction, Additional Introduction, Chorus and Stanzas (Web of Indian Classical Dances, 2003). Depending on the type of song, the Additional Introduction and Chorus may be optional.

Each part has a few lines of lyrics for which dance steps are choreographed. In the dance video hierarchy, a shot represents a dance step, a scene represents the dance steps of any one song part that are recorded in the same location, and a video clip represents the dance steps of a song. Our DVSM represents the semantics of one dance step as a dance event. The dance step is the unit of analysis in this paper.
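The shot, scene and clip hierarchy described above can be sketched as nested records. This is only an illustrative sketch: the class names, field names and sample values are assumptions made for the example and are not part of DVSM itself.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Shot:
    """One shot, which the text equates with one dance step."""
    step_description: str


@dataclass
class Scene:
    """Dance steps of one song part recorded in the same location."""
    song_part: str   # e.g. "Introduction" or "Stanza"
    location: str    # hypothetical location label
    shots: List[Shot] = field(default_factory=list)


@dataclass
class VideoClip:
    """All dance steps of one song."""
    song_title: str
    scenes: List[Scene] = field(default_factory=list)

    def dance_steps(self) -> List[Shot]:
        # Flatten the hierarchy: the dance step is the unit of analysis.
        return [shot for scene in self.scenes for shot in scene.shots]


# Hypothetical sample data, invented for illustration only.
clip = VideoClip(
    song_title="example song",
    scenes=[
        Scene("Introduction", "garden", [Shot("hero approaches heroine")]),
        Scene("Stanza", "beach", [Shot("hero raises left hand"),
                                  Shot("hero shows flower")]),
    ],
)
```

Flattening the clip with `clip.dance_steps()` yields the individual steps that the model later describes as dance events.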

3.2. Features of Dance Videos

Dance videos have two types of features, macro features and micro features, which are annotated by human annotators at the macro and micro levels respectively (Forouszan, 2004). Macro features are general properties of the dance that are event independent, whereas micro features are properties of the dance step. That is, micro features are the spatio-temporal characteristics of the dancers while rendering the dance steps. Micro features can therefore also be called event-dependent features.

Fourth International Conference I.TECH 2006

Macro features (or bibliographic features): date of recording, time of recording, geographic origin of the dance, geographic origin of the dancers, sex, age, number of dancers in a dance, type of performance venue (such as theatre or open-air), type of the accompanying song, type of accompaniment, type of musical instrument used and type of dance video. The different dance video types are movie dance video, theatre dance video, folk dance video, classical dance video, street dance video and festival dance video. These macro features are independent of the dance steps and are common to all dances.

Micro features (dance step specific features): spatio-temporal features classify dance movement behavior, which includes the movement of one dancer in relation to another dancer, the movement of a specific body part of a dancer (such as eye or leg; refer to Appendix A for a complete list) in relation to another part of the body, the movement path of the dance (such as circular, linear, serpentine and zigzag), the distance between body parts of a dancer while performing a dance step, and the distance between dancers.

Hence, the proposed video model has to characterize a set of macro features and micro features that exist in the dance videos.

4. The Dance Video Semantics Model

A conceptual model abstracts the dance video data into a structure that supports later querying of its contents and mining of interesting patterns. For efficient conceptual modeling, one should know how choreographers demonstrate a dance to learners, since they are the experts in describing rhythmic steps to an audience. This section presents a generic dance video model that describes dance steps. Every dance step is called an event, and the model represents dance events by a set of micro features. The model is generic in the sense that it is applicable to any type of dance video. DVSM is an extension of the ER model (Chen, 1976) with object-oriented features. The goal of the model is to describe a dance step as an event.

The main entities of the model are events, objects that participate in these events, actor entities that describe contextual roles of objects in the events, agent entities that represent the action of the actor and concept entities that model the cognitive and affective features of the dancers.

For example, consider a dancer object with name, age, address and all other event-independent attributes. The same dancer assumes different roles throughout the dance video. That is, the dancer becomes the hero in one dance step, the lover in another dance step and so on. Roles are defined as attributes of actors. Other examples of actors are heroine, leader, follower, group dancer and friend. These context-specific object roles form separate actor entities, which all refer to the same dancer object. Although one could say that the actor performs the action in an event, a finer granularity is necessary for dance videos. Therefore, the contextual data of the dancers have to be described at two levels. A particular dance step is characterized by the actions of the agents that belong to the actors. Spatio-temporal characteristics are part of the actors as well as of the agents. Hence, they are described as attributes of actors and agents.

Apart from the dancer object, DVSM may also represent ordinary objects with a standard UML class diagram. Some of them are the speed of the action of an agent, the instrument used and the posture of the actor. The graph meta-schema of DVSM is depicted in Figure 1.

The graphical notations used in DVSM are as follows. A rectangle node refers to an entity or an object. A rounded rectangle node refers to a concept. A dotted rectangle node denotes an actor entity. A thick rectangle node shows an agent object. An event entity is modeled with a trapezoid. Attributes of entities and relationships are represented with oval nodes. Relationships are denoted with directed lines labeled with the name of the relationship. Relationships without names represent the containment type.

The model is instantiated as a directed acyclic graph. Graphs were chosen because they elegantly model the repetition of dance steps and because graph storage has matured in the form of graph databases. If a dance step repeats after some time, it just requires another edge pointing to the same node. A graph is formally defined as follows: let G = (V, E) be a directed acyclic graph, where V denotes the set of vertices and E denotes the set of directed edges. The different entity classes (events, actors, agents, concepts and other basic classes) become the vertices of the graph. Similarly, the set of interaction relationships is denoted by the directed edges of the graph.
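A minimal sketch of G = (V, E) as a labeled directed graph may help to see how repetition costs only one extra edge. The class `DanceGraph`, the "lyric line" vertices and the edge label `performedAs` are hypothetical names introduced for this illustration; only the label `C` (ComposedOf) is taken from the figure legend. The acyclicity check is an ordinary depth-first search, not a procedure taken from the paper.

```python
class DanceGraph:
    """Labeled directed graph: vertices carry a kind, edges carry a name."""

    def __init__(self):
        self.vertices = {}   # vertex id -> kind ("event", "actor", ...)
        self.edges = {}      # source id -> list of (label, target id)

    def add_vertex(self, vid, kind):
        self.vertices[vid] = kind

    def add_edge(self, src, label, dst):
        if src not in self.vertices or dst not in self.vertices:
            raise KeyError("both endpoints must already be vertices")
        self.edges.setdefault(src, []).append((label, dst))

    def in_degree(self, vid):
        return sum(1 for targets in self.edges.values()
                   for _, dst in targets if dst == vid)

    def is_acyclic(self):
        # Depth-first search; meeting a vertex still on the stack is a cycle.
        state = {v: "new" for v in self.vertices}

        def visit(v):
            state[v] = "active"
            for _, w in self.edges.get(v, ()):
                if state[w] == "active":
                    return False
                if state[w] == "new" and not visit(w):
                    return False
            state[v] = "done"
            return True

        return all(visit(v) for v in self.vertices if state[v] == "new")


g = DanceGraph()
for vid, kind in [("line1", "lyric line"), ("line3", "lyric line"),
                  ("stepA", "event"), ("hero", "actor"),
                  ("leftHand", "agent")]:
    g.add_vertex(vid, kind)

g.add_edge("line1", "performedAs", "stepA")
g.add_edge("line3", "performedAs", "stepA")   # the step repeats: one more edge
g.add_edge("stepA", "C", "hero")              # C = ComposedOf
g.add_edge("hero", "C", "leftHand")
```

Here the repeated step is stored once as `stepA` and is simply reached by two incoming edges, which keeps the graph acyclic.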

Figure 2: Graph of a dance step containing actors and agents. (Legend: S = Spatial, T = Temporal, SEM = Semantic, C = ComposedOf.)

Figure 1: Graphical representation of DVSM

The conceptual representation of an event, highlighting a dance step as an instance of the graph, is depicted in Figure 2. In this figure, the dance event consists of two actors whose roles are hero and heroine. The hero is initially standing to the left of the heroine and facing east. The heroine is sitting and facing west. The hero then approaches the heroine.

These spatio-temporal semantics are stored as relations. Event-independent characteristics of the actor are stored separately as video objects (not shown in the figure). The actors express joy, which is represented as an emotion. The hero raises his left hand to chest level with medium speed and displays a flower to the heroine with his right hand. The heroine remains idle without performing any action. This dance step is choreographed as part of one line of lyrics of a song. To avoid overcrowding the figure, attribute nodes are not shown. The entity classes and relationships of the model are formally defined below.

4.1. Event Entity Class

A dance step of a song is known as a dance event. For instance, consider the Bharathanatyam step Samathristy (Saraswathi, 1994). It is performed with the eyes by keeping them static without blinking. This step represents a thought, firmness, surprise or an image of an angel. Dance events can also be combined to form a composite dance event. As an illustration, consider the dance step Chandran. This step represents a moon and is a combination of two other steps, Pathagam and Lola pathmam, which should be rendered concurrently. Pathagam is performed by keeping the thumb closed and the other four fingers straight, and denotes clouds, air, sword and blessing. Similarly, Lola pathmam is performed by keeping all the fingers open and stretched, and represents a sun. A composite dance event may represent events that are rendered concurrently or sequentially by a dancer.

Dance events are composed of actors, the posture of the actors, the cognitive state of the actors and the interactions in space and time between agents and actors. Formally, a dance event is described as a tuple, Event = { EID, D, AL, ND, ML }, where EID is a unique identifier of the dance step, D is the description of the dance step, AL denotes the list of actors, ND is the number of dancers who are performing steps in the event and ML is the media locator of the video clip. This Event tuple corresponds to the Event object of Figure 1.
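The Event tuple maps naturally onto a small record type. The following is a sketch under stated assumptions: the field types, the identifier values and the media locator string are placeholders invented for the example, and `CompositeEvent` is a hypothetical helper, since the text describes composite events but does not give them a formal tuple.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Event:
    eid: str              # EID: unique identifier of the dance step
    description: str      # D: description of the dance step
    actor_ids: List[str]  # AL: list of actors
    num_dancers: int      # ND: number of dancers performing in the event
    media_locator: str    # ML: media locator of the video clip


@dataclass
class CompositeEvent:
    """Hypothetical wrapper for steps rendered together or in sequence."""
    eid: str
    parts: List[Event]
    mode: str             # "concurrent" or "sequential"

    def __post_init__(self):
        if self.mode not in ("concurrent", "sequential"):
            raise ValueError("mode must be 'concurrent' or 'sequential'")


# Placeholder identifiers and locator; only the step names and their
# descriptions come from the text.
pathagam = Event("E1", "Pathagam: thumb closed, other four fingers straight",
                 ["A1"], 1, "placeholder-locator")
lola_pathmam = Event("E2", "Lola pathmam: all fingers open and stretched",
                     ["A1"], 1, "placeholder-locator")
chandran = CompositeEvent("E3", [pathagam, lola_pathmam], "concurrent")
```

Keeping the composition outside the Event record leaves the five-element tuple unchanged while still expressing that Chandran combines two concurrent steps.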

4.2. Basic Entity Class

A dance video object refers to a meaningful semantic entity of a dance video database. It can be described using attributes that represent macro and micro features. Formally, it is defined as shown below:

Object = { OID, V, TY }, where OID is a unique object id, V = { a1:v1, ..., an:vn } is a set of n event-independent or event-dependent attribute-value pairs and TY = { AID, AGID } denotes the dependency of the object on either an actor or an agent (to be defined later).

For example, suppose the hero is showing a flower in his right hand. Here, ShowFlower is the object, in which V holds the attributes action and instrument with the values show and flower respectively. TY holds the ID of the agent Right Hand, which belongs to the actor Hero. In this case, V represents event-dependent (i.e., dance step dependent) values.
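The ShowFlower example can be written out as an instance of the Object tuple. This is an illustrative sketch only: the class name `BasicObject`, the use of optional fields for TY and the identifier strings are assumptions, not part of the formal definition.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class BasicObject:
    oid: str                         # OID: unique object id
    values: Dict[str, str]           # V: attribute-value pairs
    actor_id: Optional[str] = None   # TY: AID, set if the object depends on an actor
    agent_id: Optional[str] = None   # TY: AGID, set if it depends on an agent

    def depends_on(self) -> str:
        # An object with neither id set is event independent (a macro feature).
        if self.agent_id is not None:
            return "agent"
        if self.actor_id is not None:
            return "actor"
        return "none"


# Event-dependent object from the text: the Right Hand agent shows a flower.
show_flower = BasicObject(
    oid="ShowFlower",
    values={"action": "show", "instrument": "flower"},
    agent_id="RightHand",
)

# A macro feature modeled the same way; the value is a made-up example.
venue = BasicObject(oid="Venue", values={"type": "theatre"})
```

The same record shape covers both cases: `show_flower` depends on an agent, while `venue` carries an event-independent macro feature.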

Similarly, object may also represent any of the macro features.

4.3. Actor Entity Class

An actor is a temporal entity in dance videos, so an existence time (Vijay, 2004) can be associated with the entity to represent its life span. An actor is also a spatial entity, so the actor's displacement in space is modeled using Trajectory Points as in MPEG-7 (Martinez, 2003). Hence, actors are spatio-temporal entities playing context-dependent roles in the events. Actors can have spatial, temporal and event-specific semantic attributes describing their roles. The roles can be linguistic roles as in MPEG-7 (Martinez, 2003) or any semantic roles, such as loves.

The existence time predicate ACTOR, which is associated with the actor entity class, defines the life span of the actor in terms of the existence time granularity (e.g. min or sec): ACTOR: S(ACTOR) × Z → B. This predicate takes a particular actor entity and a particular granule (denoted by an integer, say a second) and evaluates to a Boolean. If it is true, then that actor exists in the modeled reality at that granule.

Constraint 1: The life span of an actor can exist only within the defined life span of the event to which it belongs.
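The existence time predicate and Constraint 1 can be illustrated with a small sketch. It assumes that a life span is a closed interval of integer granules (seconds) and that the event also carries such a life span; both are simplifying assumptions made for the example, and the function names are hypothetical.

```python
from typing import Tuple

# Assumed representation: inclusive (start, end) interval in seconds.
Lifespan = Tuple[int, int]


def exists_at(lifespan: Lifespan, granule: int) -> bool:
    """Existence time predicate: True if the entity exists at this granule."""
    start, end = lifespan
    return start <= granule <= end


def satisfies_constraint_1(actor_span: Lifespan, event_span: Lifespan) -> bool:
    """Constraint 1: the actor's life span lies within the event's life span."""
    actor_start, actor_end = actor_span
    event_start, event_end = event_span
    return event_start <= actor_start and actor_end <= event_end


# Made-up intervals for illustration.
event_span = (10, 20)
hero_span = (12, 18)    # entirely inside the event
late_span = (15, 25)    # runs past the end of the event
```

With these values, `hero_span` satisfies the constraint and `late_span` violates it, because the actor would outlive the event to which it belongs.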

Formally, an actor entity can be described as follows:

Actor = { AID, EID, DID, R, L, T, P }, where AID is the actor id, EID is the event id, DID is the corresponding dancer id, R denotes either the semantic or the linguistic roles of the actor, L is the existence time or life span, T represents the trajectory points (Point Set) as in MPEG-7 and P is the posture of the actor, which is a basic entity.

4.4. Agent Entity Class

The agent entity class represents the finer spatio-temporal semantics of the actions. The agent entity is the most important entity in dance videos. The essence of a dance step is the actions done by the actors, and it is the agent that performs the action. This is a distinctive feature of dance videos. Other video types possess just one or two agents, which are fixed and do not play any significant role. For example, legs are the agents in soccer videos, and bat and ball are the agents in cricket videos. The agent entity models the action of an agent that belongs to an actor. For instance, the left eye and the right eye of a heroine are agents. Formally, it is defined as:

Agent = { AGID, AID, EID, L, T, X, S, I }, where AGID is the agent id, AID and EID denote the actor id and event id respectively, X is the action the agent performs, S denotes the speed of X and I is the instrument held by the agent. L and T depict the life span and spatial trajectory, as for actor objects. Here, X, S and I are all basic entity types as defined earlier.
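To see how the Actor and Agent tuples fit together, the hero of Figure 2 can be sketched as one actor with two agents. This is a simplified illustration: the identifiers, intervals and trajectory points are invented, and posture, action, speed and instrument are stored as plain strings here, whereas the model treats them as basic entities.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Lifespan = Tuple[int, int]    # assumed inclusive interval in seconds
Point = Tuple[float, float]   # assumed 2D trajectory point


@dataclass
class Actor:
    aid: str                  # AID
    eid: str                  # EID
    did: str                  # DID: dancer id
    role: str                 # R
    lifespan: Lifespan        # L
    trajectory: List[Point]   # T
    posture: str              # P


@dataclass
class Agent:
    agid: str                 # AGID
    aid: str                  # AID: owning actor
    eid: str                  # EID
    lifespan: Lifespan        # L
    trajectory: List[Point]   # T
    action: str               # X
    speed: str                # S
    instrument: Optional[str] = None   # I


def agents_of(actor: Actor, agents: List[Agent]) -> List[Agent]:
    """Return the agents that belong to the given actor in the same event."""
    return [a for a in agents if a.aid == actor.aid and a.eid == actor.eid]


hero = Actor("A1", "E1", "D1", "hero", (0, 4),
             [(0.0, 0.0), (1.0, 0.0)], "standing")
agents = [
    Agent("AG1", "A1", "E1", (1, 3), [], "raise", "medium"),
    Agent("AG2", "A1", "E1", (2, 4), [], "show", "medium", "flower"),
]
```

Calling `agents_of(hero, agents)` returns both hands, which reflects the two-level description: the actor carries the role, and the agents carry the actions.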
