
Saturday, December 27, 2014

Theories of perception



· Bottom up theory of perception
-Theory of direct perception (Ecological view)
Top-down and bottom-up theories of perception
Psychologists often distinguish between top-down and bottom-up approaches to information-processing. In top-down approaches, knowledge or expectations are used to guide processing. Bottom-up approaches, however, are more like the structuralist approach, piecing together data until a bigger picture is arrived at. One of the strongest advocates of a bottom-up approach was J.J. Gibson (1904-1980), who articulated a theory of direct perception. This stated that the real world provided sufficient contextual information for our visual systems to directly perceive what was there, unmediated by the influence of higher cognitive processes. Gibson developed the notion of affordances, referring to those aspects of objects or environments that allow an individual to perform an action. Gibson's emphasis on the match between individual and environment led him to refer to his approach as ecological. Most psychologists now would argue that both bottom-up and top-down processes are involved in perception.


Bottom-Up Theories
The four main bottom-up theories of form and pattern perception are direct perception, template theories, feature theories, and recognition-by-components theory.
Bottom-up theories describe approaches in which perception starts with the stimuli whose appearance you take in through your eyes. You look out onto the cityscape, and perception happens when the light information is transported to your brain. Therefore, they are data-driven (i.e., stimulus-driven) theories.

Gibson’s Theory of Direct Perception
How do we connect what we perceive with what we have stored in memory? Gestalt psychologists referred to this problem as the Hoffding function (Köhler, 1940), named after the 19th-century Danish psychologist Harald Hoffding. He questioned whether perception is such a simple process that all it takes is to associate what is seen with what is remembered (associationism). An influential and controversial theorist who also questioned associationism was James J. Gibson (1904–1980).

According to Gibson’s theory of direct perception, the information in our sensory receptors, including the sensory context, is all we need to perceive anything. As the environment supplies us with all the information we need for perception, this view is sometimes also called ecological perception. In other words, we do not need higher cognitive processes or anything else to mediate between our sensory experiences and our perceptions. Existing beliefs or higher-level inferential thought processes are not necessary for perception.
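E.g., consider the phrase “THE CAT” written so that the H of “THE” is identical in shape to the A of “CAT.” Readers still tell the two letters apart, because the surrounding letters supply the context.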

Gibson believed that, in the real world, sufficient contextual information usually exists to make perceptual judgments. He claimed that we need not appeal to higher-level intelligent processes to explain perception. Gibson (1979) believed that we use this contextual information directly. In essence, we are biologically tuned to respond to it. According to Gibson, we use texture gradients as cues for depth and distance. Those cues help us perceive directly the relative proximity or distance of objects and of parts of objects.
Therefore, as noted above, Gibson’s model sometimes is referred to as an ecological model (Turvey, 2003). This name reflects Gibson’s concern with perception as it occurs in the everyday world (the ecological environment) rather than in laboratory situations, where less contextual information is available. Direct perception may also play a role in interpersonal situations when we try to make sense of others’ emotions and intentions (Gallagher, 2008). After all, we can recognize emotion in faces as such; we do not see facial expressions that we then try to piece together to arrive at the perception of an emotion (Wittgenstein, 1980).

Neuroscience also indicates that direct perception may be involved in person perception. Mirror neurons are active both when a person acts and when he or she observes that same act performed by somebody else. Furthermore, studies indicate that there are separate neural pathways (“what” pathways) in the lateral occipital area for the processing of form, color, and texture in objects.


Template Theories
Template theories suggest that we have stored in our minds myriad sets of templates. Templates are highly detailed models for patterns we potentially might recognize. We recognize a pattern by comparing it with our set of templates.  We then choose the exact template that perfectly matches what we observe (Selfridge & Neisser, 1960). We see examples of template matching in our everyday lives. Fingerprints are matched in this way. Machines rapidly process imprinted numerals on checks by comparing them to templates. Increasingly, products of all kinds are identified with universal product codes (UPCs or “bar codes”). They can be scanned and identified by computers at the time of purchase. Chess players who have knowledge of many games use a matching strategy in line with template theory to recall previous games (Gobet & Jackson, 2002). Template matching theories belong to the group of chunk-based theories that suggest that expertise is attained by acquiring chunks of knowledge in long-term memory that can later be accessed for fast recognition. Studies with chess players have shown that the temporal lobe is indeed activated when the players access the stored chunks in their long-term memory (Campitelli, Gobet, Head, Buckley, & Parker, 2007).
Template-matching theories fail to explain some aspects of the perception of letters. We identify two different letters (A and H) from only one physical form. Hoffding (1891) noted other problems. We can recognize an A as an A despite variations in the size, orientation, and form in which the letter is written.
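The exact-match idea, and the weakness just described, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 5x5 letter grids and the helper names `match_template` and `shift_right` are hypothetical and are not taken from any of the cited studies.

```python
# Hypothetical 5x5 binary templates ("#" = ink, "." = blank).
# The shapes are illustrative assumptions, not stimuli from the literature.
TEMPLATES = {
    "A": (
        "..#..",
        ".#.#.",
        "#####",
        "#...#",
        "#...#",
    ),
    "H": (
        "#...#",
        "#...#",
        "#####",
        "#...#",
        "#...#",
    ),
}


def match_template(pattern, templates=TEMPLATES):
    """Return the label whose template equals the pattern exactly, else None."""
    pattern = tuple(pattern)
    for label, template in templates.items():
        if template == pattern:
            return label
    return None


def shift_right(pattern):
    """Shift every row one cell to the right, dropping the last column."""
    return tuple("." + row[:-1] for row in pattern)


if __name__ == "__main__":
    print(match_template(TEMPLATES["A"]))               # exact copy of the stored template
    print(match_template(shift_right(TEMPLATES["A"])))  # same letter, moved by one cell
```

In this sketch a one-cell shift is enough to defeat the match, which mirrors the objection that strict template matching cannot cope with changes in size, orientation, or form.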

The Prototype Theory
Rosch (1973, 1975) proposed that, rather than having a number of predefined templates in our minds, we categorise percepts by referencing prototypes. Prototypes are similar to templates in that they represent outlines or ideas of what an object should look like. Unlike templates, however, which require an exact match, prototypes rely on a best guess based on which features are in place.
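A minimal sketch of best-guess matching against prototypes follows. It assumes, purely for illustration, that a percept and a prototype can each be written as a set of named features and that similarity is the share of features they have in common (Jaccard overlap). Neither the feature names nor the similarity measure comes from Rosch's work.

```python
# Hypothetical prototypes, each described by a set of visual features.
PROTOTYPES = {
    "A": {"left_diagonal", "right_diagonal", "horizontal_bar", "apex"},
    "H": {"left_vertical", "right_vertical", "horizontal_bar"},
}


def similarity(features, prototype):
    """Jaccard overlap: shared features divided by all features involved."""
    features = set(features)
    return len(features & prototype) / len(features | prototype)


def categorize(features, prototypes=PROTOTYPES):
    """Return the label of the most similar prototype (a best guess)."""
    return max(prototypes, key=lambda label: similarity(features, prototypes[label]))


if __name__ == "__main__":
    # A sloppily written "A" with no crossbar still lands closest to "A".
    sloppy_a = {"left_diagonal", "right_diagonal", "apex"}
    print(categorize(sloppy_a))
```

Unlike an exact template comparison, the percept here does not have to contain every feature of the prototype; the closest prototype wins.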

Feature-Matching Theories
Yet another alternative explanation of pattern and form perception may be found in feature-matching theories. According to these theories, we attempt to match features of a pattern to features stored in memory, rather than to match a whole pattern to a template or a prototype (Stankiewicz, 2003).
The Pandemonium Model
One such feature-matching model has been called Pandemonium (“pandemonium” refers to a very noisy, chaotic place, and to hell). In it, metaphorical “demons” with specific duties receive and analyze the features of a stimulus (Selfridge, 1959).
In Oliver Selfridge’s Pandemonium Model, there are four kinds of demons: image demons, feature demons, cognitive demons, and decision demons. Figure 3.12 shows this model. The “image demons” receive a retinal image and pass it on to “feature demons.” Each feature demon calls out when there are matches between the stimulus and the given feature. These matches are yelled out to demons at the next level of the hierarchy, the “cognitive (thinking) demons.” The cognitive demons in turn shout out possible patterns stored in memory that conform to one or more of the features noticed by the feature demons. A “decision demon” listens to the pandemonium of the cognitive demons. It decides on what has been seen, based on which cognitive demon is shouting the most frequently (i.e., which has the most matching features).
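The four levels of demons described above can be sketched as a small pipeline. This is a toy illustration, not Selfridge's implementation: the stimulus is assumed to arrive already coded as a set of stroke names, and the feature lists in `LETTER_FEATURES` are hypothetical.

```python
# Hypothetical stroke features for three letters (illustrative assumption).
LETTER_FEATURES = {
    "A": {"oblique_left", "oblique_right", "horizontal"},
    "H": {"vertical_left", "vertical_right", "horizontal"},
    "T": {"vertical_center", "horizontal"},
}


def image_demon(stimulus):
    """Receive the incoming image and pass it on unchanged."""
    return frozenset(stimulus)


def feature_demons(image, letter_features=LETTER_FEATURES):
    """Each feature demon 'shouts' if its own feature is present in the image."""
    known_features = set().union(*letter_features.values())
    return {feature for feature in known_features if feature in image}


def cognitive_demons(shouted_features, letter_features=LETTER_FEATURES):
    """Each cognitive demon shouts as loudly as the number of its features it heard."""
    return {
        letter: len(features & shouted_features)
        for letter, features in letter_features.items()
    }


def decision_demon(loudness):
    """Pick the loudest cognitive demon (ties go to the first one listed)."""
    return max(loudness, key=loudness.get)


def recognize(stimulus):
    """Run a stimulus through all four levels of demons."""
    image = image_demon(stimulus)
    shouted = feature_demons(image)
    loudness = cognitive_demons(shouted)
    return decision_demon(loudness)


if __name__ == "__main__":
    print(recognize({"vertical_left", "vertical_right", "horizontal"}))
```

Note that the demons for "A" and "T" also shout a little for this stimulus, because all three letters share the horizontal stroke; the decision demon simply goes with the loudest voice.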
Feature detection has also been expanded to identify 'local-precedence' (Martin, 1979) and 'global-precedence' (Navon, 1977) effects. A local-precedence effect occurs when local (smaller or unique) features are detected in an image, whereas global precedence takes place when the features form a larger image or a wider outline is identified. To better demonstrate this effect, take a look at the image below. You will notice that the 'T' shapes on the left are spaced so far apart that they stand out more as individual letters, whereas the image on the right stands out more as a larger 'T', even though it is formed of lots of smaller 'Ls' put together. This is because the 'Ts' on the left trigger a local-precedence effect, where less detail causes the individual parts to stand out more, and the 'Ls' on the right trigger a global-precedence effect, where more detail comes together to form a larger, overall image.
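A figure of the kind described, a large letter built from many small ones, can be generated as plain text. The sketch below is only a rough stand-in for such a stimulus: the 5x5 outline `GLOBAL_T` and the function `navon_figure` are hypothetical names chosen for this example.

```python
# Hypothetical 5x5 outline of the global letter "T" ("#" marks a filled cell).
GLOBAL_T = (
    "#####",
    "..#..",
    "..#..",
    "..#..",
    "..#..",
)


def navon_figure(global_shape, local_letter):
    """Draw the global shape using copies of the small local letter."""
    return "\n".join(
        "".join(local_letter if cell == "#" else " " for cell in row)
        for row in global_shape
    )


if __name__ == "__main__":
    # A large "T" made of small "L"s: the global and local letters differ.
    print(navon_figure(GLOBAL_T, "L"))
```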

Structural Description Theories
         Objects represented as configurations of parts (features plus relations among features)
         Retinal image used to extract parts
         Object-centered
         Example:  Biederman’s Structural Description Theory
Structural Description Theory (Biederman)
         Objects are represented as arrangements of parts
         The parts are basic geometrical shapes or “Geons”
         Object-centered
         Evidence:  degraded line drawings
One of the concepts that we’ve learned about that relates to a lot of my experiences is the concept of geons. Geons are part of a theory about how we recognize objects. The Recognition by Components theory, developed by Biederman in 1987, incorporates the structural description theory and says that there are 36 three-dimensional shapes that all objects are made up of. These shapes are called geometrical ions or geons (or primitives). These geons, and the idea that all objects are made up of them, are very similar to the basic process of learning how to draw. I started drawing when I was really young. Like most kids I started doodling as soon as I was big enough to hold a crayon. But the hobby stuck with me and developed over the years. I was self-taught for almost my entire life and only took an actual art class when I entered high school. It was difficult at first to unlearn the ways I was used to drawing and relearn some of the basics of sketching. Some aspects didn’t help improve my art at all, so I didn’t use them as much. But the one important skill I learned that I’ve taken with me throughout the rest of my life was doing the initial sketching by using what are, essentially, geons. Visually, everything, including human figures, is composed of basic two- and three-dimensional shapes like squares, circles, triangles, and cylinders. Once you can visualize how this works, it makes drawing much easier. Take a human figure: the head is a circle, the shoulders and all the joints are circles, the arms and legs are rectangles or cylinders, the torso is an upside-down triangle, the pelvic bone is an upright triangle, and the feet and hands are ovals with thin rectangles protruding from them. Although a theory about how we recognize objects is obviously different from a skill used for drawing, the similarities made it easier for me to understand Recognition by Components theory because, in a way, I’d been practicing a rudimentary version of it for years.


Structural Description Theories
Proponents of structural description theories propose that objects are represented by parts and their spatial relationships, which together form a structural description of an object. These descriptions discard an object's color and texture, for example, as the appearance of surface properties changes with changes in viewing conditions (e.g., a change in lighting can change how color appears to an observer). The basic idea is that the same structural description can be recovered or otherwise derived from different retinal images of the same object. This robustness remains an appealing aspect of structural description theories despite the loss of surface information. Structural description theories have also been referred to as part-based or edge-based theories, given their reliance on parts and edges.
The first viable structural description theory for human object perception was proposed by David Marr and Keith Nishihara. According to their theory, object parts (e.g., a cat's leg) are represented by 3-D primitives called generalized cones, which specify arbitrary 3-D shapes with a set of parameters. For example, a cylinder can be produced by taking a circular cross section and sweeping it along a straight line. The circle traces out a cylinder with the line forming the main axis of that cylinder. By comparison, a rectangular cross section sweeps out the surface of a brick. More complex 3-D shapes can similarly be produced by sweeping different 2-D cross sections across different axes. One of the challenges faced by Marr and Nishihara was how 3-D generalized cones can be recovered from 2-D images. They suggested that an object's bounding contour (the outline of an object in a picture) could be used to find the axes of its main parts. These axes could then be used to derive generalized cones and their spatial configuration. Recognition could then proceed by matching the structural description recovered from the image to those stored in visual memory. Thus, Marr and Nishihara try to solve the invariance problem by recovering view-invariant 3-D models from images.
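The sweeping operation behind generalized cones can be sketched numerically. The code below covers only the simplest case mentioned in the text, a fixed cross section swept along a straight axis. The function names and the choice of the z axis as the main axis are assumptions made for this example, not part of Marr and Nishihara's formulation.

```python
import math


def circle_cross_section(radius, n_points=8):
    """Sample points on a circle of the given radius in the x-y plane."""
    return [
        (radius * math.cos(2 * math.pi * k / n_points),
         radius * math.sin(2 * math.pi * k / n_points))
        for k in range(n_points)
    ]


def rectangle_cross_section(width, height):
    """Return the four corners of a rectangle centred on the origin."""
    half_w, half_h = width / 2, height / 2
    return [(-half_w, -half_h), (half_w, -half_h), (half_w, half_h), (-half_w, half_h)]


def sweep_along_z(cross_section, length, n_steps=5):
    """Copy the 2-D outline at evenly spaced heights along a straight z axis.

    Requires n_steps >= 2 so that both ends of the axis are sampled.
    """
    surface = []
    for step in range(n_steps):
        z = length * step / (n_steps - 1)
        surface.extend((x, y, z) for x, y in cross_section)
    return surface


if __name__ == "__main__":
    cylinder = sweep_along_z(circle_cross_section(1.0), length=2.0)
    brick = sweep_along_z(rectangle_cross_section(2.0, 1.0), length=3.0)
    print(len(cylinder), len(brick))
```

Sweeping the circle yields surface points of a cylinder and sweeping the rectangle yields those of a brick. A fuller generalized cone would also let the cross section change size along the axis and let the axis curve.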
Following Marr and Nishihara's seminal 1978 work, Irving Biederman proposed another influential structural description theory in the mid-1980s: recognition by components (RBC). Biederman argues that objects are mentally represented by a set of 36 components and their spatial relationship. He called these geons, for "geometrical ions." Geons are a subset of the generalized cones proposed by Marr and Nishihara, three of which are shown on the top of Figure 1(b). The combination of these geons into structural descriptions can be used to create familiar objects like a mug, a pail, or a briefcase, as shown in the bottom of Figure 1(b).
RBC theory builds on Marr and Nishihara's structural description theory in two innovative ways. First, unlike generalized cones, geons only differ qualitatively from each other. For example, a geon's axis can only be straight or curved, whereas generalized cones can, in principle, have any degree of curvature. Biederman's second innovation was to propose a more direct means to recover geons from images. According to RBC theory, geons are recovered from nonaccidental properties. These are properties of edges in an image (e.g., lines) that are associated with properties of edges in the world. To understand nonaccidental properties, consider seeing a box from many different viewpoints. From most views, observers see three sides of the box, which terminate in a "Y"-junction at a corner. This two-dimensional junction is an example of a nonaccidental property, and it is associated with a three-dimensional corner.
· Top Down theory of perception
Top-down processing is an important perceptual theory in cognitive psychology. The theory holds that sensory information processing in human cognition, such as perception, recognition, memory, and comprehension, is organized and shaped by our previous experience and expectations, as well as by meaningful context (Solso, 1998).
Top-down processing suggests that we form our perceptions starting with a larger object, concept, or idea before working our way toward more detailed information. In other words, top-down processing happens when we work from the general to the specific; the big picture to the tiny details. In top-down processing, your abstract impressions can influence the sensory data that you gather.

Top-down processing is also known as conceptually-driven processing, since your perceptions are influenced by expectations, existing beliefs, and cognitions. In some cases you are aware of these influences, but in other instances this process occurs without conscious awareness.

In constructive perception, the perceiver builds (constructs) a cognitive understanding (perception) of a stimulus. He or she uses sensory information as the foundation for the structure but also uses other sources of information to build the perception. This viewpoint also is known as intelligent perception because it states that higher-order thinking plays an important role in perception. It also emphasizes the role of learning in perception (Fahle, 2003). Some investigators have pointed out that not only does the world affect our perception but also the world we experience is actually formed by our perception (Goldstone, 2003). These ideas go back to the philosophy of Immanuel Kant. In other words, perception is reciprocal with the world we experience. Perception both affects and is affected by the world as we experience it.


· Computational theory of perception
Marr used the term "computational theory" to describe this aspect of his approach to visual perception. The term emphatically does not mean a theory that is just "something to do with computers". Instead, it expresses the specific and very powerful idea that the first stage in understanding perception is to identify the information that a perceiver needs from the world, and the regular properties of the world that can be incorporated into processes for obtaining that information. In other words, we need to know what computations a visual system needs to perform, before attempting to understand how it carries them out. In later chapters, we will see examples of Marr's application of computational theory to problems such as detecting the edges of surfaces, perceiving depth, or recognizing objects. The approach has been widely influential; we saw an example of the same way of thinking in Chapter 3 (p. 57) when we discussed the possibility that cells in the visual cortex act as filters tuned to statistical regularities in images of natural scenes. Indeed, the computational approach even brings some common ground between Marr's theory and that of Gibson (see Chapter 14, p. 408).

Computational theories of perception can be applied not only to human vision but also to other species, by considering what information an animal needs from light in order to guide its activities.
