Journal of Ophthalmology & Eye Care
ISSN: 2639-9296
Visual Maps and Visual Perception
Copyright: © 2021 Gattass R. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Purpose: This review aimed to provide an electrophysiological basis for the neuronal representations that allow awareness of visual scenes.
Methods: Electrophysiological recording has revolutionized the understanding of visual perception. We review our studies with multielectrode recordings describing topographically organized visual maps in several cortical areas, as well as electrophysiological recordings revealing perceptual completion of borders and filling-in of color and texture. In addition, we analyzed the existing literature to update the knowledge about serial hierarchical processing, parallel processing and remapping on a dynamic network of cortical visual areas.
Results: Visual perception is organized in craniumcentric coordinates based on retinotopic maps in the layers of the thalamic lateral geniculate nucleus. These retinotopic maps project to the primary visual cortex; after perceptual completion and filling-in, the representation becomes a binocular visuotopic map. Perceptual completion allows this integration, which implies the reconstruction of form perception from partial contour information. The V1 visuotopic map, in addition to reconstructing partial contour information, creates a stereoscopic map based on the disparity of monocular information. This stereoscopic map, which is seen with both eyes open, is different from each of the monocular maps, although one eye is always dominant for the location of a target. This neuronal representation is perceptually stable regardless of eye movements. Thus, keeping the head in position and scanning the scene with the eyes can reconstruct a scenario with high resolution, using the foveal region of the retina to build the scene. Remapping and efferent copies of the eye movements make this scenario stable and entirely in color. By moving the head, we generate craniumcentric maps, which are perceptually stable regardless of eye movements. During eye movements, two phenomena occur: perceptual inhibition during saccadic eye movements, which suppresses the perception of retinal image slip, and remapping, which transfers oculocentric references onto the craniumcentric map. The ambient map is a conscious reconstruction of the scene with optimization of resolution, color and contrast across the entire field of view. For each position of the head, the oculomotor system scans the scene with the eyes, using foveal vision to construct a high-resolution color scenario that generates an optimized visual representation. Thus, a large scenario is “reconstructed” piece by piece.
Conclusions: Visuotopic, craniumcentric and ambient representations are constructed on a dynamic, topographically organized network composed of cortical visual areas.
Keywords: Visual Topography; Primates; Visual System; Visual Representations; Remapping; Perception
List of abbreviations: V1: Primary Visual Area (striate cortex); V2: Second Visual Area; AIP: Anterior Intraparietal Area; CIP-1: Caudal Intraparietal Area 1; CIP-2: Caudal Intraparietal Area 2; DI: Dorsointermediate Area; DM: Dorsomedial Area; FST: Visual Area at the Fundus of the Superior Temporal Sulcus; LIPd: Dorsal Portion of the Lateral Intraparietal Area; LIPv: Ventral Portion of the Lateral Intraparietal Area; MIP: Medial Intraparietal Area; MST: Medial Superior Temporal Area; MT: Visual Area MT; MTp: Peripheral Portion of MT; PIP: Posterior Intraparietal Area; PO: Parieto-Occipital Area; TEa: Anterior Portion of Area TE; TEm: Medial Portion of Area TE; TEO: Posterior Inferior Temporal Cortex; TEp: Posterior Portion of Area TE; TH: Cytoarchitectonic Area TH; V3A: Visual Complex V3, Part A; V3d: Dorsal Portion of Visual Area 3; V3v: Ventral Portion of Visual Area 3; V4: Visual Area 4; V4t: V4 Transition Zone; VIP: Ventral Intraparietal Area; VTF: Visual Portion of Parahippocampal Area TF; amt: Anterior Middle Temporal Sulcus; ar: Arcuate Sulcus; ca: Calcarine Fissure; ce: Central Sulcus; ci: Cingulate Sulcus; co: Collateral Sulcus; dp: Dorsal Prelunate Area; ec: External Calcarine Sulcus; io: Inferior Occipital Sulcus; ip: Intraparietal Sulcus; la: Lateral Sulcus; lu: Lunate Sulcus; ot: Occipitotemporal Sulcus; p: Principal Sulcus; pmt: Posterior Middle Temporal Sulcus; pom: Medial Parieto-Occipital Sulcus; rh: Rhinal Sulcus; sp: Subparietal Sulcus; st: Superior Temporal Sulcus
Within approximately 300 ms of opening our eyes, we have vision. Vision is the ability to interpret the surrounding environment using light reflected from objects. The resulting perception is also known as eyesight or visual perception. Vision is used for reading, for hand and body coordination, and for navigating or orienting in the environment. Reading, or reading comprehension, is the ability to process text, understand its meaning, and integrate it with what the reader already knows. It depends on visual acuity. This attribute, also referred to as clarity of vision, depends on optical and neural factors, such as the sharpness of the retinal focus within the eye, the health and functioning of the retina, and the interpretative capacity of the brain. Orienting in space, or spatial orientation, refers to the ability to identify the position or direction of objects or points in space [1]. It can be assessed by asking patients to perform spatial transformations such as rotations or inversions of stimuli. In general, the position and orientation in space of a body are defined relative to a reference frame. Several systems in the brain participate in vision.
The main and most important system is the retina-geniculate-striate system, responsible for visual acuity and for depth and color perception, among other attributes. The cortical target of this system is the striate cortex, also known as the primary visual cortex or V1. The main evidence that we have more than one system comes from anatomy and from the ability known as blindsight. In the blindsight condition, functionally blind individuals, who are cortically blind due to lesions in their striate cortex, react to stimuli of which they are not consciously aware. There is compelling evidence that blindsight occurs because visual information is conveyed through other routes that bypass the primary visual cortex [2].
Lesions in the retina, either central or peripheral, differentially affect the different attributes of vision. Macular degeneration causes loss of central vision and creates an inability to read but preserves good orientation in space. This degeneration, also known as age-related macular degeneration, is a medical condition that results in blurred or no vision in the center of the visual field. In dry macular degeneration, the fovea of the retina deteriorates, but this initially has only a minor effect on acuity. In wet macular degeneration, leaky blood vessels grow under the retina and can cause detachment of the photoreceptor layer from the choroid. The choroid is the vascular layer of the eye, containing connective tissue and lying between the retina and the sclera. The choroid provides oxygen and nourishment to the outer layers of the retina; its structure is generally divided into four layers. Neovascular macular degeneration results from abnormal blood vessels that grow under the retina, bleed and leak fluid. Vascular endothelial growth factor acting on endothelial cells stimulates angiogenesis and drives wet macular degeneration.
This central retinal lesion is different from that of diabetic retinopathy, a more diffuse abnormality in which the retina is damaged by chronically elevated blood glucose. The early signs of this retinopathy appear on the surface of the retina as microaneurysms, hemorrhages and exudates. These retinal lesions also differ from those of glaucoma, in which the optic nerve connecting the eye to the brain is damaged, usually due to high intraocular pressure. The most common type of glaucoma (open-angle glaucoma) often has no symptoms other than slow vision loss. Angle-closure glaucoma, although rare, is a medical emergency, and its symptoms include eye pain with nausea and sudden visual disturbance.
In this review, we discuss the anatomy of the visual system, the different visual systems, and the effects of perceptual completion and filling-in on the retina-geniculate-striate system. We focus on visual representation in the brain and its effect on vision. We review the visual topography at every stage of visual processing and describe the visual representations as retinotopic, visuotopic, craniumcentric or ambient maps. We explain the differences among these representations and their consequences for central and peripheral lesions.
Numerous studies in anatomy, neurophysiology and brain imaging have addressed the challenge of understanding visual processing in the brain. Different studies have shed light on visual perception, each emphasizing different attributes of vision. The early work of Daniel & Whitteridge [3] emphasized the organization of the visual topography in the primary visual cortex and its relation to visual acuity. They showed that the magnification factor of central vision is much higher than that of the visual periphery and suggested that this organization accounts for the difference in visual acuity.
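This relation between cortical magnification and eccentricity is commonly summarized by an inverse-linear model (a standard formalization from the later literature, given here only as an illustration; the functional form and parameters are not taken from Daniel & Whitteridge):

$$ M(E) \;\approx\; \frac{M_0}{1 + E/E_2}, $$

where $M(E)$ is the magnification factor (mm of cortex per degree of visual field) at eccentricity $E$, $M_0$ is the foveal magnification, and $E_2$ (on the order of 1°) is the eccentricity at which $M$ falls to half its foveal value. Because $M$ decays steeply with $E$, the central few degrees of the visual field command a disproportionately large fraction of V1, consistent with the high acuity of central vision.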
Parallel pathways of visual processing begin within the retina with different classes of ganglion cells (P and M) projecting to different subcortical structures, which in turn project differentially to cortical areas. The ganglion cells project to different subdivisions of the dorsal lateral geniculate nucleus, which in turn project to different regions of layer IVc in V1. Different compartments of V1 project to different CytOx-rich and CytOx-poor stripes in V2 or to MT (V5).
We have been working with the notion that most visual processing is performed in cortical modules. Cortical modules were first discovered in the somatosensory system (area 2) by Vernon Benjamin Mountcastle [4]. Ascending circuits and intrinsic circuits build cortical modules that decode specific attributes of the sensory input. In the visual system, orientation modules were first described by David Hubel and Torsten Wiesel [5], who also proposed a hierarchical model for visual processing. In this model, the concentric receptive fields of dLGN cells are assembled into orientation-selective simple cells in V1, arranged in columns, which in turn build complex and hypercomplex cells. These columns would form edge detectors used by higher areas, such as the inferior temporal cortex, to assemble object or form detectors, such as the face-selective cells (“grandmother cells”) described by Gross and collaborators [6]. This hierarchical theory is the basis for serial processing in the visual system [5].
The discovery of several areas with topographically organized maps [3,7-11], containing modules selective for different attributes of the visual stimulus, such as motion or color [12], created the basis for parallel processing in the cortical visual areas. The frequency limitations of neural processing suggest that parallel processing is required for the efficient detection of an image. Several areas work simultaneously, in parallel, to allow fast processing of the visual scene (Figure 1D). The very nature of neural signals and connections, with action potentials lasting on average more than 1 ms, limits the propagation of information to less than 1 kHz, and the interaction of cortical modules, or synchronization between neurons, is limited to a band of 1 to 300 Hz. Nonetheless, when we open our eyes, we build a stable percept in approximately 300 ms. Thus, parallel processing is a crucial mechanism for achieving this performance. The role of feedback circuits, whether intrinsic or part of a wider network, is also evaluated in this review.
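A rough back-of-envelope estimate illustrates why this timing argues for parallel processing (the ~10 ms per stage figure is an assumption for illustration, not a value given in the text): if each synaptic relay and local integration step takes on the order of 10 ms, then

$$ N_{\text{serial stages}} \;\lesssim\; \frac{300\ \text{ms}}{\sim 10\ \text{ms per stage}} \;\approx\; 30, $$

so only a few tens of strictly serial steps fit within the time needed to build a stable percept. Analyzing orientation, disparity, color and motion concurrently in separate areas keeps the depth of the processing cascade within this budget.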
The segregation and organization of the central portion of the cortical areas was proposed by Zeki. For this author, different attributes of the image are analyzed by different areas [12-14], and this segregation serves as the basis for simultaneous parallel processing.
Different attributes of the scene are processed in different areas to accomplish parallel processing: V1 (orientation columns: orientation selectivity, the main attribute for form perception, and perceptual completion, an attribute for the representation of an object), V2 (retinal disparity, an attribute for 3D perception) [15], V4 (color selectivity, an attribute for color vision) and MT (axis-of-movement columns [16], an attribute for the perception of motion). In macaques, visual area V2 is the earliest site in the visual processing hierarchy at which neurons selective for relative disparity have been observed [15,17]. By combining optical imaging, single-unit electrophysiology and cytochrome oxidase (CO) histology, Ts'o and collaborators [18] revealed in greater detail the functional organization within the CO stripes of visual area V2 of primates.
We also propose that feedforward and feedback connections play an important role in determining the activity of each module in a wider network (Figure 1D). For example, the activity of a locus in V1 may depend on the activity of several loci in extrastriate areas located anteriorly. Figure 1D shows a topographically organized network representing one point in the upper visual field, composed of subcortical and cortical regions, starting from the different types of ganglion cells in the retina [19]. Axons from retinal ganglion cells project to the superior colliculus (SC) and the dorsal lateral geniculate nucleus (dLGN). Cells of the dLGN project mainly to the primary visual cortex (V1), while cells of the SC project to the pulvinar, which in turn projects to several cortical areas. Ganglion cells in the retina thus project topographically to the visual areas and form an extensive topographically organized network of feedforward and feedback connections.
Ungerleider and Mishkin [20] proposed the concept of visual information processing streams. They defined a ventral and a dorsal stream, the first related to object recognition and the second to motion processing. With the description of visual area PO, it was suggested that the dorsal stream would be subdivided into a dorsal medial and a dorsal lateral stream [19,21]. The new dorsal medial stream would be related to the processing of locomotion [22]. The concept of visual information processing in non-human primates evolved toward three streams, each with a primary relay area receiving direct projections from the striate cortex [12,23,24]. We extended this concept to four streams of information processing in humans, as illustrated in Figure 1C. The different streams receive most of their connections from discrete portions of V1. The ventral stream receives projections from the central 30° of V1, the ventral lateral stream, which would be related to reading, receives projections from the central 4-5°, the dorsal lateral stream receives projections from the central 60°, and the dorsal medial stream receives projections from the peripheral field (8-90°).
After reviewing the connectivity of the cortical visual areas, Felleman and Van Essen [25] proposed the concept of a wide network of connections among the visual areas, suggesting that virtually all areas are connected to one another. Nonetheless, our current concept is that the perception of an aware and conscious scene representation depends on the activity of a dynamic network (Figure 1D) of areas of the occipital, temporal and parietal cortices, with feedforward and feedback connections. This hypothesis implies that the selectivity of individual cells, or the activity in one (or more) visual information processing stream in the network, can be accessed and can contribute to the visual representation at the conscious level.
Figure 1A summarizes the location and extent of the visual areas in primates [21,24,27-29]. These areas were defined by visual topography and cortical connections. The largest area is V1, surrounded by area V2, which has a second-order representation [7] that optimizes the length of cortical connections between V1 and V2. Visual area V3 surrounds V2 and is posterior to area V4. In the superior temporal sulcus (STS), we find area MT, first described in the macaque as a movement area by Zeki [12]. Area MT is surrounded by V4t [30]. Area TEO is anterior to V4 and is a transitional area between the occipital and temporal lobes. It is the last area in the direction of the temporal lobe that has a crude topographic organization [31]. Although no one has yet carried out an electrophysiological mapping of the areas in the intraparietal sulcus (IPS), namely CIP-1, CIP-2, VIP, MIP, PIP or LIP, the connectional results suggest at least an upper vs. lower field segregation within these areas. Segregated clusters of labeled neurons in the IPS, following retrograde tracer injections in PO, allow us to draw several conclusions. First, they indicate that areas CIP-1, CIP-2, LIP, VIP, PIP and MIP all exhibit some degree of topographic organization. Areas PIP, MIP and PCi represent a portion of the visual field that seems similar to the one represented by area PO, which emphasizes the visual periphery. VIP and LIP, in particular, are organized such that the superior and inferior hemifields are represented in their anterior and posterior portions, respectively. Second, areas CIP-1 and CIP-2 predominantly represent the superior visual hemifield. Third, labeled neurons in VIP, LIPv and LIPd form noncontiguous sets of clusters corresponding to the same PO injections. This result supports VIP as an independent area and supports the partitioning of LIP into ventral and dorsal portions. Fourth, PO projections support V3d and V3v as a single, unique visual area. V3d is distinct from V2d, which contains an independent set of topographically organized labeled clusters. V3d is also distinct from V3A (DI), which contains a representation of the upper and lower visual fields. We also observed a distinct area, namely DM, located anteriorly to V3d and medially to V3A. This partitioning allows V3d and DM to retain a representation of the lower visual field, as both areas share the same clusters of labeled neurons. Similarly, it allows V3A (DI) to retain representations of both the lower and upper visual hemifields. Previous reports have long suggested the existence of areas DM and DI in the primate visual cortex [33]. By injecting retrograde tracers into MT, we were able to distinguish isolated clusters of labeled neurons in regions matching the locations of areas DM and DI [32]. Furthermore, immunoarchitectural data based on Cat-301 immunohistochemistry corroborate these conclusions [29].
Tracer injections in V4 show that PIP, VIP and LIP receive projections predominantly from the peripheral representation of the visual field. Additionally, the V2 projections reaching VIP are mainly those of the far-peripheral representation of the visual field. The arrangement of the labeled clusters corroborates the notion that the projections to the IPS areas have some degree of topographic organization. Within the IPS and adjoining regions, only PO, V3A, VIP and LIP receive projections from V2, while PO, PIP, DI, V3A, DP, LIP and VIP receive projections from V4. V3A (DI), DP, PIP, MIP, POd, CIP, LIP, VIP and area 5d project to PO. Area POm, at the medial surface of the hemisphere (as described by Colby et al. [24] and Rosa et al. [32]), also projects topographically to area PO.
We studied the visuotopic organization (Figure 1B) of visual areas: V1, V2, V3, V4, MT, PO and POd [8-10,19,34-36]. Today, we view the organization of the cortical visual maps in another context based on retinotopic organizations [3,7,33,37]. This new concept introduces the notion of a visuotopic organization, with partial reconstruction of the cortical representation based on implicit visual cues. This concept is illustrated at the initial stages of visual processing, as shown in Figure 1B. This figure shows the representation of a visual scene projected at multiple levels of the visual system. The scene is divided into two parts limited by a vertical meridian that passes at the point of fixation, one for each hemisphere. The anatomical organization of the optic nerve, optic chiasma and optic tract produces a contralateral representation of the visual hemifield at the dLGN, superior colliculus and primary visual cortex. The contralateral nasal retina and the ipsilateral temporal retina project to the ipsilateral dLGN, while the other portions of the retina project to the contralateral dLGN. In addition, the projection to the dLGN and to the primary visual cortex follows the density of photoreceptors of each part of the retina and creates a neural representation of the visual scene with large magnification of the central field (Figure 2).
Both the right eye and the left eye receive the projection of the scene with the limitations imposed by the blood vessels and the blind spot present in the retina. In each eye, the representation of the blind spot is located in a different place in the visual field, so when the two projections are merged, uncovered retinal information predominates, producing a representation of the scene without blind spots in the cortical projection. The location of the blind spot can be seen as a gap in the appropriate sagittal sections of the dorsal lateral geniculate nucleus (dLGN), where the contralateral parvocellular and magnocellular layers are interrupted (Figure 2, right insert). The retinotopic representation of the contralateral visual hemifield is projected onto different layers of the dLGN, stacked together to allow point-to-point interaction between layers [38,39]. In the primary visual area, the representation of the scene is reconstructed with large magnification of central vision. At this level, local and feedback circuits contribute to the reconstruction of the visuotopic map. There is a contrast, in electrophysiological recordings, among areas with visuotopic organization, areas with cells responding to specific properties of the visual stimuli, and areas in the temporal lobe with properties related to form perception [6]. In the inferior temporal (IT) cortex, there is no evidence for visuotopic organization, and the stimulus selectivity of any given IT neuron is similar throughout its large receptive field, showing stimulus equivalence across retinal translation. Inferotemporal neurons respond better to complex objects than to slits of light or edges. A few IT units appear selective for specific objects such as hands or faces. These properties of IT neurons indicate that they play a role in analyzing the global aspects of a complex object, such as its shape, rather than in analyzing local features, such as the retinal locus of edges and borders [6].
One example of selectivity to a global aspect, such as a monkey face, is shown in Figure 3, which depicts the responses of an IT neuron to monkey faces, a human face and a human hand. The responses of the cell to the presentation of monkey and human color slides are compared with each other and with responses to images in which some of the components were removed or rearranged. The two different monkey faces tested elicited strong responses that were significantly reduced, but not eliminated, by removing the eyes, the snout, or the color. Thus, no single component (such as the eyes, the snout, or the color) was by itself sufficient to explain the response to the original face. Scrambling the internal features of the face greatly reduced the response, indicating that a particular configuration of the internal features was essential. This was confirmed by the observation that the hand elicited almost no response [40].
In the eye, blood vessels covering the retina obstruct portions of the visual scene from stimulating photoreceptors at the fundus of the eye. Similarly, the optic disk, a region of the retina devoid of photoreceptors, generates a visual scotoma known as the blind spot (BS). In addition, the density of photoreceptors varies with eccentricity. Such anisotropies and discontinuities in the visual field are not usually perceived with monocular vision due to rapid bottom-up visual representation reconstruction, a phenomenon known as perceptual completion [41]. Bottom-up visual processing is considered the result of ascending projections, with intrinsic or horizontal connections, acting through a series of hierarchical visual areas [5,42]. The intrinsic lateral connections are the anatomical substrate for RF surround properties and are responsible for perceptual completion in the primary visual cortex, V1 [41,43,44]. In V1, information from both eyes arrives segregated into layer 4C.
In the capuchin monkey, we studied visual area V1 to address its local organization, looking for anisotropies in the local representation. We concluded that there was a small anisotropy related to the orientation of the ocular dominance columns and that the shape of V1, with its 3D isoeccentric curves, was similar to that described in the macaque by Daniel & Whitteridge [3]. We also studied the organization, location and visual topography of area V2, specifically addressing the anisotropies related to the orientation of the CytOx bands of V2 [45]. V2 has a second-order map of the visual field, with a split at the horizontal meridian. The visual topography of V2 is coarser than that of V1. In V2, receptive fields corresponding to recording sites separated by a cortical distance of up to 4 mm may represent the same portion of the visual field. Quantitative analysis showed that the representation of the central visual field is magnified relative to that of the periphery. The cortical magnification factor is greater along the isopolar dimension than along the isoeccentric dimension. We studied the location and organization of area MT, also referred to as V5 [30]. MT has a first-order map of the visual field similar to that of V1. We also studied the organization of areas PO and POd in the capuchin monkey [22]. PO is located in the ventral aspect of the anterior bank of the parieto-occipital sulcus and the ventral precuneate gyrus. Different from the previously described visual areas, PO and POd are organized in the isopolar domain; that is, they have a precise organization of isopolar lines and a complex organization of isoeccentric lines. A large and complex representation of the periphery, from 20° to 60° of eccentricity, is present in the lateral and medial portions of these areas. By contrast, the representation of the central 20° is very small in both PO and POd. The visual maps of PO and POd have discontinuities and re-representations of portions of the visual field that appear to be different in nature from those previously observed in the capuchin monkey in areas V2, MT and V4 [30,45,46]. The receptive field size in PO and POd is invariant with eccentricity. The visual maps of PO and POd may be related to a specific arrangement of centrifugal or centripetal functional modules that could be involved in the processing of the flow-field movement of objects. The unique organization of these areas, with two congruent albeit complex maps, suggests the possibility of cooperative interactions between them. In addition, the maps in PO and POd place greater emphasis on the peripheral representation than those of other prestriate areas, such as V2 [9,45], V3 [10], V4 [10,46], MT [8,30,47,48] and TEO [31].
Figure 1 shows the location (Figure 1A) and visuotopic organization (Figure 1B) of the cortical visual areas on a flat reconstruction of the monkey cortex. In this figure, two visual areas have emerged rather undisputed: the first (V1) and second (V2) visual areas, each exhibiting a precise topographic representation of the contralateral visual field [3,7,9,11,27,45]. In addition, area MT, situated further anteriorly, has a well-established location, extent and topographic organization in both Old and New World monkeys [8,30,49,50]. Anterior to V2, we described the third visual area (V3) and the parieto-occipital area (PO) [10,24]. Based on the topography and anatomical connections of V2 [27], we propose the existence of an area V3 that includes both ventral (V3v, upper contralateral quadrant representation) and dorsal (V3d, lower contralateral quadrant representation) components in Old World macaques and New World capuchin monkeys. In addition, injections throughout the extent of V4 show a similar pattern of feedback projections to the same region anterior to V2 [28]. As illustrated in Figure 1B, in our proposal, area PO borders V2 and the medial portion of area V3d, such that V2 shares the horizontal meridian (HM) with both V3d and area PO. Based on studies in the macaque [10,24,28], area PO (also referred to as V6) is a distinct cortical area. Notably, we showed that the projections from areas V2 and V4 to area PO concern mainly the peripheral (>20°) representation of the visual field, although some receptive fields mapped in area PO are large enough to encompass the foveal and parafoveal representations [22].
Perceptual completion is a phenomenon in which contour and shape are perceived even though these features are not physically present on the retina. In the human retina, there is a large region naturally devoid of photoreceptors called the blind spot, corresponding to the head of the optic nerve. It has an elliptical shape approximately 7.5° in height and 5.0° in width, located approximately 15° from the fovea. This discontinuity in the receptive surface, under normal circumstances, is not accompanied by abnormal perception, even under monocular conditions. Anatomical data show that the sensory gap in the retina due to the blind spot is present in the dorsal lateral geniculate nucleus [38,39] and is still present in the afferent layer (4C) of the primary visual cortex (V1) in non-human primates and humans [51,52]. Functionally, cells located inside a given eye-projecting column tend to give better responses to that eye but usually also respond to the other eye [53]. In tangential CytOx preparations of V1, the blind spot representation is easily identifiable as a large area without direct retino-geniculate input from the contralateral eye [33,51]. Fiorani and collaborators have shown that neurons within the cortical representation of the optic disk in V1 interpolate receptive field position for the contralateral eye based on the extension of stimuli beyond the boundaries of the blind spot [30,54]. In addition, they showed that the ability to interpolate receptive field position across large distances is also present in neurons in other portions of V1 [55].
Fiorani and collaborators [30] presented post-stimulus histograms obtained with different mask sizes for different orientations and axes of movement for a neuron located in the lower-field representation of the visual field, outside the representation of the optic disk. Previous studies had already shown selective inhibitory modulation from a wider area beyond the classic receptive field [56-58].
In conclusion, by masking the classical receptive field, Fiorani et al. [30] demonstrated that other previously unresponsive regions can be made to excite neurons in V1. Thus, the extent of the area that can drive a given cell in V1 is several times larger than the classic receptive field. Several factors contribute to the response of the active surrounds. Coherent motion and collinearity of the stimuli were important elements for the interpolated responses or the completion effect. Collinearity was important but not necessary.
The large size of the active surrounds and the restricted arborization of geniculate-striate fibers make it unlikely that this pathway is the only pathway responsible for the generation of interpolated responses. Cells in the lateral geniculate nucleus do not show the completion effect [59]. Intrinsic connections in V1 and “backwards” extrastriate projections may both have the necessary coverage for spanning the gaps in the image. Thus, it would be plausible that feedback projections to V1 could play an important role in completion. Injections of fluorescent tracers in V1 at different topographical locations revealed labeled cells in several areas anterior to V1. Central injections in V1 labeled neurons in V2, V3, V4, V4t and MT, while peripheral injections labeled areas PO and MST [60]. Thus, the completion-like phenomenon described for a single cell in V1 can be either due to the existence of active surrounds that could be generated in local microcircuits in V1 or be the product of a more distributed network involving areas located anteriorly to V1.
Interpolated receptive fields and discontinuous receptive fields behave analogously to the perceptual completion phenomenon that “fills in” the image across naturally blind regions of the retina. The requirements for completion at the cellular level seem similar to those that elicit completion in psychophysical studies [61,62].
In studies with anomalous contour stimuli, Peterhans and von der Heydt emphasized that neurons in V2 were selective for the orientation of contrast borders as well as for anomalous contours [63-65] and lines defined by collinear dots [66]. The stimulus configuration used in the study of the “artificial blind spot” conforms to the condition of amodal completion defined by Kanizsa [67].
We also studied neural activity in different visual areas in awake, behaving monkeys while they fixated on a dynamic background. Using a patch located in the receptive field, we found that, after 6-8 s, cells in V3 presented activity compatible with filling-in [68]. Filling-in is a perceptual phenomenon in which visual features such as color, brightness, texture or motion are perceived in a region of the visual field even though such an attribute exists only in the surround.
Filling-in is observed in various situations and is an essential part of our normal surface perception. Psychophysical experiments suggest that some active processes are involved in the occurrence of filling-in and that some neural computation occurs in the brain when filling-in occurs. The time course of these dynamic changes in activity parallels the time course of perceived filling-in of the hole by human observers, suggesting that this process mediates perceptual filling-in [68].
Figure 1C shows four streams of visual information processing related to different aspects of visual perception. This figure emphasizes the extent of the visual field of V1 projecting to these streams. Most, if not all, streams receive direct projections from V1 to their initial target areas. The direct projection from V1 to Wernicke's area or the angular gyrus in humans is still an open question. We have no direct evidence in humans of a projection from V1 to this area; however, the temporal association cortex is considered a primate specialization and is involved in complex behaviors, some particularly characteristic of humans, such as language. The emergence of these behaviors has been linked to major differences in temporal lobe white matter across several anthropoid primates [69]. These differences parallel the differences in the white matter bundles leaving the occipital pole, the posterior arcuate fasciculus and the vertical occipital fasciculus [70].
Studies in non-human primates confirmed direct projections from V1 to V2, MT and PO [12,23,24,71]. The visual topography of human V1, shown at the left of this figure, points to a geometric decay of the magnification factor, with isoeccentricity lines equally spaced for 1, 2, 4, 8, 16, 32 and 64 degrees. The main extent of the visual field of V1 that projects to the areas of these streams is shown in the middle of the figure. Most of the available data from functional MRI focus on the ventral stream of information processing as described by Ungerleider and Mishkin [20]. This stream is composed of V1, V2, V3, V4 and TEO projecting to several areas in the temporal lobe. In this stream, there are descriptions of areas responsive to faces, hands and houses located in the temporal lobe [72-74]. The fusiform face area in the temporal lobe is specialized for expert object recognition [75]. We probably spend more time looking at faces than at any other object. We therefore associate this stream with object discrimination.
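The equal spacing of these doubling eccentricities follows from the roughly logarithmic mapping implied by the inverse-linear magnification model given above. The short sketch below illustrates this numerically; it is only an illustration, and the parameter values (M0, E2) are assumed, typical literature estimates rather than values reported in this article:

```python
import numpy as np

# Illustrative sketch (assumed parameters, not values from this article).
# Integrating M(E) = M0 / (1 + E/E2) gives the cortical distance from the
# foveal representation: w(E) = M0 * E2 * ln(1 + E/E2).
M0 = 17.3   # foveal magnification, mm of cortex per degree (assumed)
E2 = 0.75   # eccentricity (deg) at which M falls to half its foveal value (assumed)

def cortical_distance_mm(ecc_deg):
    """Distance along V1 (mm) from the foveal representation to a given eccentricity."""
    return M0 * E2 * np.log(1.0 + ecc_deg / E2)

eccentricities = np.array([1, 2, 4, 8, 16, 32, 64], dtype=float)
w = cortical_distance_mm(eccentricities)

for e, dist in zip(eccentricities, w):
    print(f"{e:4.0f} deg -> {dist:5.1f} mm")

# For E >> E2, each doubling of eccentricity adds a nearly constant step of
# about M0 * E2 * ln(2) ~ 9 mm, so the 1, 2, 4, ..., 64 deg lines end up
# roughly equally spaced on the cortical map.
print("steps between doublings (mm):", np.round(np.diff(w), 1))
```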
When humans read written words while inside an MRI scanner, they activate the opercular and central representations of V1 in addition to areas in the left occipital gyrus, while listening to a word activates the auditory areas and areas in the temporal lobe known as Wernicke's area in the superior temporal gyrus [76]. High-span readers showed more activation in the left angular gyrus [76]. These results corroborate previous studies of listening and reading comprehension [77-79]. Activation of the central striate cortex was also observed by Bavelier et al. [80]. We named this stream the ventral lateral or ventrolateral stream; it could be considered the cognitive stream. This stream is used for reading, and most of the acuity tests used in ophthalmological practice probe its properties. Patients with macular degeneration lose the very central region used for reading. The foveal region in primates extends for approximately 5 degrees. The largest letter generally used in the acuity test (letter E) encompasses approximately 20 minutes of arc. Thus, in general, the tests used in clinical ophthalmology probe this stream and not the dorsolateral stream that underlies visuomotor coordination.
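For reference, the visual angle subtended by a test letter follows directly from its height $h$ and viewing distance $d$; the letter height used below (35 mm at 6 m) is an assumed value chosen only to illustrate a target of roughly 20 arc minutes:

$$ \theta \;=\; 2\arctan\!\left(\frac{h}{2d}\right) \;=\; 2\arctan\!\left(\frac{0.035\ \text{m}}{2 \times 6\ \text{m}}\right) \;\approx\; 0.33^{\circ} \;\approx\; 20\ \text{arcmin}. $$

Such a target is tiny compared with the approximately 5° foveal region, which is why standard acuity charts sample only the innermost portion of the field feeding the ventrolateral stream.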
The existence of a ventral lateral stream in non-human primates was never considered [19,20]. However, behavioral data from patients with macular degeneration point to the existence of a new stream of visual information processing dealing with language: patients with extensive macular degeneration are unable to read with their peripheral vision.
Eichert et al. [81] investigated the extent to which between-species alignment based on cortical myelin could predict changes in connectivity patterns across macaques, chimpanzees and humans. Evolutionary adaptations of the temporo-parietal cortex are considered a critical specialization of the human brain, and these authors specifically addressed how language evolved in primates. Since the human lineage diverged from that of the other great apes millions of years ago, changes in the brain have given rise to behaviors that are unique to humans, such as language. Some of these changes involved alterations in the size and relative positions of brain areas, while others required changes in the connections between those regions. Eichert et al. [81] showed that the expansion and relocation of brain areas can predict the terminations of several white matter tracts in the temporo-parietal cortex, including the middle and superior longitudinal fasciculi, but not the arcuate fasciculus. Thus, the difference in the arcuate fasciculus cannot be explained solely by changes in the positions of brain regions; instead, this tract underwent additional changes in its course, which may have contributed to the evolution of language.
Most of the visual field representation, including the binocular representation of the visual field, projects to the dorsolateral stream of visual information processing, which includes areas MT and MST [82], areas in the intraparietal sulcus and parietal areas. These areas interact with sensory-motor areas and are responsible for the perception of object movement and for visuomotor coordination [83]. This stream is used to guide most body movements and underlies the ability to drive a motor vehicle. We therefore consider the use of acuity tests to renew a driver's license inappropriate; visuomotor tests of the binocular region would be more appropriate.
In the dorsomedial stream, the peripheral field of V1 projects to areas PO, POd, areas in the intraparietal cortex and areas in the parietal lobe. These areas are organized in the isopolar domain and are probably suited to process centrifugal and centripetal movement of objects. These areas project to areas of the intraparietal sulcus and to areas of the parietal lobe [24,29].
A retinotopic map refers to the orderly mapping of the receptive field position in retinotopic coordinates in the brain. A retinotopic map implies the existence of a neuronal representation organized in retinotopic coordinates. The significance of visual maps and brain maps originates with the 19th century debate concerning localization of function in the brain. Evidence for the existence of retinotopic maps, and by implication, localization of function, in the visual cortex came from analyses of visual field scotomas resulting from partial injuries to the visual cortex caused by bullet wounds sustained by soldiers in different wars. These studies showed a predictable relationship between the region of damage in the striate cortex and the location of the area of blindness in the visual field [84-86].
After studying the visual topography of V1, we are certain that we do not have a retinotopic map in the neocortex. The map of V1 and those of the other extrastriate areas are visuotopic; that is, they reconstruct the image representation based on predictable cues. If we view a newspaper page printed with many imperfections (partially interrupted letter fonts), we automatically reconstruct the text based on local circuits or feedback connections to V1. Thus, the representation of the image in the primary visual cortex is visuotopic and not retinotopic.
Figure 4 summarizes the different maps perceived in the normal condition. A retinotopic map only exists in the layers of the dLGN. In this nucleus, the magnification factor is different for the parvocellular and magnocellular layers. Due to this difference, we represented the images of the scene in Figure 4 without the magnification of the central vision.
Visual perception is three-dimensional. It presents a number of properties described above: perceptual completion in V1, filling-in in V3, stereoscopic responses (due to retinal disparity) in V2, and a color representation due to processing in V4. The visual representations in the neocortex are based on extensive parallel, serial and feedback connections. The representation is stabilized by feedback from the efference copy of the commands to the extraocular muscles. Remapping and perceptual inhibition are characteristics of the image representation in the neocortex. If we keep the head in the same position, the representation of the image in the neocortex is stable despite occasional eye movements. However, if we move the head, changing the position of the skull in the environment, the neural representation of the scene changes, and a new perspective of the image replaces the original one. There is no evidence that efferent copies of the medullary commands to the neck muscles reach the neocortex; in this case, we have a craniumcentric perception of the scene. The neuronal representation of the scene is also modified when we move forward or backward in the environment. The centrifugal and centripetal movement of objects in the environment, studied with an immersive bubble procedure, generates maps used for visuomotor coordination and a visual representation of the environment based on an egocentric map.
The summary in Figure 4 suggests that the retinotopic maps in the LGN generate visuotopic maps with the aid of perceptual completion, stereopsis, filling-in, color processing and visual remapping due to eye movements. These visuotopic representations generate craniumcentric representations and, finally, an ambient representation of the visual scene.
When we move our eyes while keeping the head fixed, visual perception is stable (Figure 5A, B, C and D). Thus, the visual representation in higher visual areas should also be stable. However, if we move the head, the perception changes, and a new perspective of the visual scene emerges. This difference is related to the nature of the integration between the cortical visual areas and the areas controlling eye movements versus those controlling head movements. Signals related to the motor nuclei of the extrinsic eye muscles are integrated into areas of the intraparietal cortex, whereas signals from the medullary areas controlling the neck muscles do not reach the neocortex. Muscle spindles and other movement receptors are well integrated into the cerebellum and are responsible for the harmonious coordination of the muscles, resulting in precise control of head movement. Thus, stabilization by feedback from the efference copy of the eye movement generates a stable representation of the visual scene (Figure 5E), while the perception of the same scene during an extrinsic movement of the eye, produced by externally tapping one eyeball with a finger, is destabilized (fuzzy) (Figure 5F).
The areas of the intraparietal sulcus (IPS) are organized such that the more perceptual visual areas are located posteriorly, areas related to eye movements and to movement in near extrapersonal space are located in the central portion of the sulcus, and areas related to sensory-motor components are located more anteriorly. These latter areas are richly interconnected with the frontal lobe and thereby participate in executive control. Lewis & Van Essen [87], using myelin staining and SMI-32 immunohistochemistry, identified 13 areas (17 architectonic subdivisions) in the IP and PO regions of the macaque. Despite the terminological differences, the arrangement they describe is broadly similar to that of Preuss & Goldman-Rakic [88] and of Pandya and collaborators [89-92].
We showed that the topography of V1 has a continuity that offsets topographic retinal discontinuities. The physiological substrate of V1 is sufficient to generate completion, which is thus already present at this early perceptual stage. These results showed that the interpolated response inside the blind spot representation is organized topographically in a manner similar to that of other regions of V1, emphasizing the functional importance of the orientation/direction circuits in the previously described receptive field spatial dynamics. Based on these results, we propose that V1 has a continuous functional topographic map that is “visuotopic” rather than a discontinuous “retinotopic” map.
The intermingling of V2 cells projecting to areas V4 and TEO suggests that similar information is relayed to both areas. Although there is still some question about the relative proportions of color- and orientation-selective cells in the CO-rich thin stripes and CO-poor interstripe regions of V2, it is commonly accepted that these two stripe systems, taken together, contain many color- and form-selective cells and some cells that are selective for both [93-96]. The thin stripes and interstripe regions lack large numbers of directionally selective cells, which instead are mainly located in the CO-rich thick stripes that project to area MT [94].
Area LIP, in the intraparietal sulcus, is related to areas mediating saccadic eye movements. Its neurons are capable of fast remapping of receptive fields that anticipate the future position of the eyes [97-101]. These neuronal processes require the integration of visual and motor information. This area receives input from several visual areas and is interconnected with the frontal eye field (FEF) and the superior colliculus [102,103]. Projections from V4 (an area known to be involved in selective attention) to LIP are well established [89,92,104]. Goldberg and collaborators [105] suggested that the activity of LIP neurons is not directly involved in generating saccades but rather indicates the focus of attention [101].
Vision captures information via discrete eye fixations, interrupted by saccadic suppression and limited by retinal inhomogeneity. However, scenes are perceived as coherent, continuous and meaningful despite eye movements. Intraub [106] and Shioiri et al. [107] proposed a multisource model of visual scene representation in terms of an egocentric spatial framework that is ‘filled in’ by visual sensory input, amodal perception, expectations and constraints derived from rapid scene classification and object-to-context associations. Visual perception depends on the neocortical representation of visual information. If the information does not reach the neocortex, it can be used for visuomotor adjustments, but it is not available to visual perception. We propose that visual areas V1 and V2, which are equivalent in size and magnification, may hold a stable visual scene representation. This representation is an anisotropic reflection of the visual sensory input, rebuilt in V1 and V2 via perceptual completion, filling-in, retinal disparity and color propagation to construct a stabilized perception of the visual scene. The relationship of this representation to perception was well characterized by Albright [108], who suggested that perception is the consequence of a complex neuronal computation in which contextual information is used to transform incoming signals from a sensory-based to a scene-based representation.
We will address the visual representations at the conscious level, as previously reported [109], and relate them to visual maps in the cortex. Figure 4 shows images of the different visual representations in the brain, illustrating the difference between retinotopic, visuotopic, craniumcentric (or cyclopic) and ambient maps. The ambient or egocentric map enables the relationship between the visual map and the motor map of the individual. It is important for translating a location on the skull-centered map into a location on the map of nearby extracorporeal space, and for correlating visual space with nearby extracorporeal space during ambulation, as in the case of ambulation in an immersive bubble. The processes that enable the integration of these maps combine perceptual, postural, motor planning and motor information. From a perceptual point of view, the visual system is organized in oculocentric coordinates, initially generating retinotopic maps in the layers of the thalamic lateral geniculate nucleus. These retinotopic maps project into the neocortex, and in the primary visual cortex they become binocular visuotopic maps. In this process, the LGN's projections to V1 are organized in such a way as to promote perceptual completion, allowing an integration that implies the reconstruction of form perception from partial contour information.
The V1 map, in addition to reconstructing partial contour information, creates three-dimensional information linked to the disparity of monocular information. Thus, the retinotopic map of one eye is integrated with the retinotopic map of the other eye at a fixation plane, generating a three-dimensional map from the non-coincident (disparate) regions of the two monocular images. This 3D map, which is viewed with both eyes open, is different from each of the monocular maps, although one eye is always dominant for the location of a target.
This 3D map is a craniumcentric (skull-centered or cyclopic) map, and it is perceptually stable regardless of eye movements. Thus, keeping the head in position and scanning the scene with the eyes can reconstruct a scenario with high resolution, using the foveal region of the retina to build the scene. This scenario is stable, has high resolution and is entirely in color, regardless of eye movements. The high acuity is related to the density of ganglion cells in the retina and their projection to the primary visual cortex, V1. The high magnification factor in the central region was well documented by Daniel and Whitteridge [3] in their early study of visual topography in the cortex. Note that the resolution of the craniumcentric representation is reconstructed in the neocortex by a set of areas that constitute a very efficient network for generating a percept with high resolution that spreads toward the visual mid-periphery, up to approximately 40°. Thus, we propose that high resolution and color information propagate through the network to the representation of the periphery, compatible with our conscious perception of the scene for each position of the head. Therefore, the craniumcentric map is perceptually stable regardless of eye movements, and yet, during eye movements, two phenomena occur: perceptual inhibition, which prevents us from noticing retinal image slip during the eye movement, and topographic remapping, which transfers oculocentric references onto the craniumcentric map. Eye movement information can come either from an efferent copy of the motor command or from extraocular muscle spindle position information.
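A minimal first-order way to formalize this remapping (our illustrative notation; the article does not state this equation explicitly) is

$$ \mathbf{x}_{\text{head}} \;=\; \mathbf{x}_{\text{retina}} \;+\; \mathbf{g}, $$

where $\mathbf{x}_{\text{retina}}$ is the oculocentric (retinal) location of a target, $\mathbf{g}$ is the current eye-in-head position supplied by the efference copy or by proprioceptive signals, and $\mathbf{x}_{\text{head}}$ is the resulting craniumcentric location. After a saccade of size $\Delta\mathbf{g}$, the retinal coordinate of a stationary target shifts to $\mathbf{x}_{\text{retina}} - \Delta\mathbf{g}$ while the gaze signal becomes $\mathbf{g} + \Delta\mathbf{g}$, so their sum, the craniumcentric location, is unchanged, which is what perceptual stability across eye movements requires.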
The ambient or egocentric map is a conscious reconstruction of the scene with optimization of resolution, color and contrast across the entire field of view. For each position of the head, the oculomotor system scans the scene with the eyes, using the foveal region to construct a high-resolution color scenario that is generalized across the full “idealized” visual field for that target. Thus, a large scenario is “reconstructed” piece by piece to build the ambient representation.
Two visual areas, V1 and V2, contain a complete representation of the visual field with great emphasis on the central field representation. They are likely to be the basis of the main visual representation; they compose a duo that could work together to create a main visual representation in the neocortex (an ambient map). They receive upstream and feedback projections from most of the visual areas. In addition, area V4, with extensive feedforward and feedback cortical and subcortical projections, is likely to be the basis for the craniocentric map. The visuotopic map could be based on the activity of V1 cells, while the retinotopic map would exist at subcortical levels, mainly in the dLGN.
We would like to thank Prof. Mario Fiorani for his helpful insights and suggestions on the manuscript. This study was supported by grants from the Fundação de Apoio Carlos Chagas Filho (FAPERJ E-26/010.001.238/2016, E-26/210.917/2016 - PRONEX) and FINEP (0354/16).
Leslie Ungerleider made fundamental contributions to the understanding of visual perception and of the cortical organization of primates. Her enthusiasm, scientific expertise and rigorous application of anatomical and functional MRI techniques have significantly influenced these fields of research. For us, Leslie will always be missed!
Figure 1: Location (A) and visuotopic organization (B) of the cortical visual areas, visual streams of visual information processing (C) and topographically organized network (D)
Figure 2: Visual topography at early stages of processing
Figure 3: Response of an IT neuron to faces and a hand
Figure 4: Neuronal representations
Figure 5: Perceptual spatial constancy in the presence of eye movements, and visual perception with natural and artificial eye movements. The scene shown in (A) can be scanned with eye movements, but the perception of the image remains constant despite the different images represented on the retina (B, C and D). Stabilization by feedback from the efference copy of the eye movement (E) is compared with the perception of the same scene while one eye is tapped externally with a finger (F).