Cortical trade-offs in generalist vs. specialist bias

Neocortex is a network of neurons, loosely organized in a hierarchy of generalization. In this hierarchy, stimuli selectively propagate from primary to association areas and cortices ("Cortex & Mind", Cortical Memory by Joaquin Fuster). Higher areas replace their inputs by increasingly general patterns, with greater spatial and temporal scope of indirectly represented stimuli.

Generalization is just another word for pattern discovery, and a basic pattern is a coincidence of multiple inputs. For a neuron, the inputs are presynaptic spikes. A sufficient number of coincident spikes triggers Hebbian learning: “fire together, wire together” between simultaneously spiking neurons. More precisely, a synapse is strengthened if the presynaptic neuron fires just before the postsynaptic one.
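The timing rule above can be illustrated with a minimal sketch of pair-based spike-timing-dependent plasticity. This is only an illustration under stated assumptions: the function names, the exponential shape of the timing window, and all parameter values (a_plus, a_minus, tau_ms, window_ms) are hypothetical, and the weakening branch for the reverse spike order is a common modeling convention rather than something claimed in this post.

```python
import math

def stdp_weight_change(dt_ms, a_plus=0.01, a_minus=0.012, tau_ms=20.0):
    """Weight change for one pre/post spike pair.

    dt_ms = t_post - t_pre. All parameter values are illustrative assumptions.
    """
    if dt_ms > 0:
        # Presynaptic spike came first: strengthen, more so for closer timing.
        return a_plus * math.exp(-dt_ms / tau_ms)
    if dt_ms < 0:
        # Reverse order: weaken (assumed convention, not stated in the text).
        return -a_minus * math.exp(dt_ms / tau_ms)
    return 0.0

def coincident_inputs(pre_spike_times_ms, t_post_ms, window_ms=5.0):
    """Count presynaptic spikes arriving within window_ms before the postsynaptic spike."""
    return sum(1 for t in pre_spike_times_ms if 0 <= t_post_ms - t <= window_ms)

# Closer timing gives a larger increase; reverse order gives a decrease.
close = stdp_weight_change(5.0)
far = stdp_weight_change(20.0)
reverse = stdp_weight_change(-5.0)
n_coincident = coincident_inputs([1.0, 2.0, 3.0, 10.0], t_post_ms=4.0)
```

In this toy version, the "strength" of a pattern corresponds to both the count returned by coincident_inputs and the closeness of timing that scales each weight change.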

The choice of coincident inputs becomes exponentially greater with the length of connections. Hence, stronger patterns (greater number and closer timing of coincident input spikes) can be discovered. But that greater range must come at the cost of having fewer total connections, representing less detail. This in turn requires greater selectivity in learning: longer reinforcement to form and strengthen synapses.

Most of neocortex is connections between neurons (dendrites and axons), plus their life support. Given limited resources within a skull, there must be a tradeoff between total number of connections and their average length. In other words, cortex can be relatively dense, with more numerous connections of shorter average length, or sparse, with fewer total connections of greater average length.
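The arithmetic behind this trade-off can be sketched with a toy model, assuming that the only constraint is a fixed total length of "wire". The function name and all numbers below are hypothetical and chosen only for illustration; real cortex is also constrained by volume, metabolism, and conduction delays.

```python
def max_connections(wiring_budget_mm, mean_length_mm):
    """Number of connections that fit in a fixed total wiring length.

    Toy assumption: total wiring = number of connections * mean length.
    """
    if mean_length_mm <= 0:
        raise ValueError("mean_length_mm must be positive")
    return int(wiring_budget_mm // mean_length_mm)

budget_mm = 1000.0  # hypothetical fixed wiring budget

dense = max_connections(budget_mm, mean_length_mm=0.5)   # many short connections
sparse = max_connections(budget_mm, mean_length_mm=2.0)  # fewer long connections

print(dense, sparse)  # prints: 2000 500
```

With the budget held constant, quadrupling the average length cuts the number of connections to a quarter, which is the dense vs. sparse choice in its simplest form.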

So, other things being equal, there must be a tradeoff between speed and detail of learning, and relative scope and stability of learned patterns. The former is prioritized in a dense hierarchy (my “specialist bias”), and the latter in a sparse hierarchy (my “generalist bias”).

Cellular factors in density vs. range trade-off.

The initial determinant of cortical “density” is the rate of division and survival for neuronal progenitor cells during embryonic and perinatal cortical development. Slower division or faster die-off would leave fewer progenitor cells, which in turn produce fewer cortical neurons. That should leave more space and resources to grow correspondingly longer axons and dendrites between them. This is likely determined by nerve growth factors and their receptors: higher activity would produce a denser network. One such factor may be coded by the CNTNAP2 gene, expressed mostly in prefrontal and parietal cortices, and probably correlated with autism: Genes for autism or genes for connectivity.

The most recognizable feature of neocortex is its six layers. Deeper and older layers VI and V mostly mediate cortico-subcortical integration, layer IV propagates data flow up the cortical hierarchy via thalamus, and newer layers II and III provide intra-cortical connectivity, mostly via layer I axons.

Henry Markram recently reported innate “peak connectivity” of layer V pyramidal cells at 300-500 μm. There are ~50 cell clusters (representation units) interlaced within that distance. It seems to me that these clusters provide minimal representation redundancy and mutual support via reverberating firing. Each cluster probably responds to some specific intensity of stimulus. These clusters inhibit each other within a column: “Sparse distributed coding model…” to adjust for redundancy within the receptive field.
So, variation in the range of such peak connectivity may be one of the main factors in dense vs. sparse bias.
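Mutual inhibition among clusters can be illustrated with a very rough sketch, assuming simple subtractive inhibition: each unit is suppressed in proportion to the summed activity of all other units. The function name and the strength parameter are hypothetical; this is not the model from the cited paper.

```python
def lateral_inhibition(activities, strength=0.2):
    """One step of subtractive mutual inhibition among units in a column.

    Each unit loses strength * (sum of the other units' activity),
    and activity is clipped at zero. Purely illustrative.
    """
    total = sum(activities)
    return [max(0.0, a - strength * (total - a)) for a in activities]

# Three clusters responding to the same stimulus with different strength.
before = [1.0, 0.5, 0.2]
after = lateral_inhibition(before)
# The strongest response survives almost intact, the weakest is silenced.
```

The result is a sparser code: redundant weak responses are removed while the best-matching cluster remains active.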

A unique feature in human brain (and to a lesser extent in other primates and whales) is spindle cells. Wikipedia: "Spindle cells emerge postnatally and eventually become widely connected with diverse parts of the brain, evidencing their essential contributions to the superior capacity of hominids to focus on difficult problems." Axons of spindle cells are less branched than those of pyramidal neurons, and their extended range must come at the expense of reduced density of other connections. This trade-off probably enables better top-down (general-to-specific) focus in humans.

Another possible factor in the trade-off is the ratio of glia to neurons, which I think is also a good sign of a "sparse" architecture. This is an excerpt from "The Root of Thought": "As we move up the evolutionary ladder, in a widely researched worm, Caenorhabditis elegans, glia are 16 percent of the nervous system. The fruit fly’s brain has about 20 percent glia. In rodents such as mice and rats, glia make up 60 percent of the nervous system. The nervous system of the chimpanzee has 80 percent glia, with the human at 90 percent. The ratio of glia to neurons increases with our definition of intelligence."

However, the author's interpretation that glia are a main information-processing component in the human brain is implausible. I agree with the mainstream opinion that they mainly provide support for neurons.
A greater proportion of glia would reduce the density of neurons, but enable higher activity and longer-range connections for each of them. Again, that means a sparser network.

I might be wrong on much of the specifics (above and below), but that wouldn't affect the "sparse vs. dense" premise itself. This premise depends only on the assumption that neocortex is a hierarchy of generalization for the patterns of stimuli, and I can't think of any alternative to that. The trade-off may differ among cortical areas, but I think specificity of such differences is relatively low. That’s because the number of progenitor cells in the cortical sub-plate is set before any significant differentiation between the areas. Also, genetic variation among individuals is very minor, and whatever differences develop through postnatal learning are already affected by innate biases.

Differences between higher and lower cortical regions

Joaquin Fuster traced the differences between primary & hierarchically higher association areas in Cortex & Mind, p. 73: “At the lower level, representation is highly concrete and localized, and thus highly vulnerable. Local damage leads to well-delimited sensory deficit. In unimodal association cortex, representation is more categorical and more distributed, in networks that span relatively large sections of that cortex… In transmodal areas representation is even more widely distributed…” P. 82: “Thus a higher-level cognit (e.g., an abstract concept) would be represented in a wide network of association cortex…” In my terms, wider networks imply “sparse bias” on higher levels of generalization.

A similar quotation from “How to Create a Mind” by Ray Kurzweil, p. 86: A study of 25 visual and multi-modal cortical areas by Daniel Felleman found that “As they went up the neocortical hierarchy… processing of patterns comprised larger spatial areas and involved longer time periods”.
Another study by Uri Hasson stated: “It is well established that neurons along the visual cortical pathways have increasingly larger spatial receptive field.” and found that “similar to cortical hierarchy of spatial receptive fields, there is a hierarchy of progressively longer temporal receptive windows”.

It’s also suggestive that parietal cortex seems to have higher-frequency brain waves than the prefrontal one. Buschman & Miller of MIT "have found two types of attention in two separate regions of the brain. The prefrontal cortex is in charge of willful concentration; if you are studying for a test or writing a novel, the impetus and the orders come from there. But if there is a sudden, riveting event - the attack of a tiger or the scream of a child - it is the parietal cortex that is activated. The MIT scientists have learned that the two brain regions sustain concentration when the neurons emit pulses of electricity at specific rates - faster frequencies for the automatic processing of the parietal cortex, slower frequencies for the deliberate, intentional work of the prefrontal." I think higher-frequency waves are associated with reaction speed and detail, and lower frequency with a longer feedback loop of higher levels.

The neocortex is myelinated sequentially from primary to association areas at correspondingly increasing age (up to ~30 years old for prefrontal cortex), & myelination then seems to decline in reverse order ("Human Neurophysiology", page 197). Allowing for a multi-year delay in knowledge accumulation, this probably reflects &/or determines the age at which abilities peak in fields that require knowledge of corresponding generality. It's known that athletic abilities (primary cortices) peak in the early 20s, and mathematical skills (likely parietal cortex) a bit later.

On the other hand, performance in business, politics, social sciences, and literature (anterior prefrontal cortex?) doesn't peak until late in life. This is probably even more true in philosophy, but performance metrics there are questionable. Also supportive is the observation that this cortical development sequence is delayed by several years in subjects with ADHD. Obviously, effective generality of discovered concepts, and thus also development of higher association areas, depends on attention span.

The differences among cortical regions also include cortical hemispheric asymmetry. It appears that the left hemisphere represents higher-generality concepts, especially semantic ones. The right hemisphere works mostly in the background, likely searching for lower-level contextual patterns (Cortex & Mind, p. 184, Split Brain, Gazzaniga). So, according to my premise, the left hemisphere should be relatively “sparse”. This is supported in “Cortex & Mind”, p. 185: “Pyramidal cells in language areas have been found to be larger on the left than on the right (Hayes & Lewis, 1995; Hutsler & Gazzaniga, 1997)”. Their dendritic trees also extend further than those of right-hemisphere pyramids (Jacobs & Scheibel, 1993).

Another study: "Hemispheric asymmetries in cerebral cortical networks" found that columns in the left hemisphere contain fewer minicolumns and better myelinated axons than corresponding areas of the right hemisphere. Total volume and the number of synapses seem to be the same for both hemispheres. The hemispheres are densely interconnected by the corpus callosum. Some of this connectivity provides for simple sensory-motor field integration and fault-tolerance. But greater “lateralization” in humans, vs. other primates, suggests that our hemispheres are integrated into a combined hierarchy, resulting in ultimately greater power of generalization. A Finnish study found that ambidexterity (correlated with lesser hemispheric asymmetry) doubles the risk of ADHD and lower academic performance in children.

Autism as a dense-connectivity cognitive style.

The best evidence for individual differences in cognitive focus comes from research on autism, which is known to increase attention to details, often at the expense of higher generalization. So, it’s a good proxy for a "specialist phenotype", which according to my thesis should display greater short-range vs. long-range connectivity. Below, I summarize some evidence for such bias in autism.

Casanova in "Abnormalities Of Cortical Circuitry In The Brains Of Autistic Individuals" reports that autistic individuals have more numerous but smaller and more densely packed minicolumns, each containing smaller than normal neurons with shorter axons (I came across it via A Shade of Gray: excellent review of relevant research, highly recommend).
A related study, "Comparison of the Minicolumnar Morphometry of Three Distinguished Neuroscientists and Controls" by Casanova, is reported in "Minicolumns, Genius, and Autism". The connectivity pattern of the neuroscientists appears to be similar to that of autistics in the density and size of minicolumns, but different in better inhibitory isolation between adjacent minicolumns. This should compensate for smaller size, while enabling a greater number of minicolumns.

Similar finding of more compact and better insulated minicolumns in primates and cetacea, compared to cats and lower mammals, was reported in A comparative perspective on minicolumns and inhibitory GABAergic interneurons in the neocortex.
The thesis of local vs. global connectivity bias in autism is also supported in Exploring the Folds of the Brain--And Their Links to Autism by Hilgetag and Barbas: "in autistic people, communication between nearby cortical areas increases, whereas communication between distant areas decreases".
Such cortex should be more reliant on cortico-thalamo-cortical vs. cortico-cortical connections, which might be the implication in Partially enhanced thalamocortical functional connectivity in autism.

Henry Markram, a leading neuroscientist, the father of an autistic son, and “pretty much an autist myself”, proposed The intense world theory - a unifying theory of the neurobiology of autism:

“The proposed neuropathology is hyper-functioning of local neural microcircuits, best characterized by hyper-reactivity and hyper-plasticity. Such hyper-functional microcircuits are speculated to become autonomous and memory trapped leading to the core cognitive consequences of hyper-perception, hyper-attention, hyper-memory and hyper-emotionality. The theory is centered on the neocortex and the amygdala, but could potentially be applied to all brain regions. The severity on each axis depends on the severity of the molecular syndrome expressed in different brain regions, which could uniquely shape the repertoire of symptoms of an autistic child. The progression of the disorder is proposed to be driven by overly strong reactions to experiences that drive the brain to a hyper-preference and overly selective state, which becomes more extreme with each new experience and may be particularly accelerated by emotionally charged experiences and trauma. This may lead to obsessively detailed information processing of fragments of the world and an involuntarily and systematic decoupling of the autist from what becomes a painfully intense world. The autistic is proposed to become trapped in a limited, but highly secure internal world with minimal extremes and surprises.”

"In the early phase of the child's life, repetition is a response to extreme fear. The autist perceives, feels and fears too much. Let them have their routines, no computers, television, no sharp colors, no surprises. It's the opposite of what parents are told to do. We actually think if you could develop a filtered environment in the early phase of life you could end up with an incredible genius child without many of the sensory challenges."

"The main critical periods for the brain during which time circuits form irreversibly are in the first few years (till about the age of 5 or so). We think this is an important age period when autism can either fully express to become a severe handicap or turned to become a major advantage. We think a calm filtered environment will not send the circuits into hyper-active modes, but the brain will keep most of its potential for plasticity. At later ages, filtered environments should help calm the autistic child and give them a starting point from where they can venture out. Each autistic child probably will first need its own bubble environment before one can start mixing bubbles. It should happen mostly on its own, but with very gentle guidance and encouragement. Do all you would want for your child ... but in slow motion ... let the child set the pace ... they need that control to feel secure enough to begin to venture off into any other bubbles."

A recent study found a reason for such intense perception: reduced synaptic spine pruning in autistic brains, secondary to mTOR over-expression. This seems to happen at the age of 3-4 years, when synaptic spine density normally decreases by ~50%. I suspect reduced pruning may also happen during prenatal development. If so, greater synaptic density should increase activity, thereby reducing normal prenatal die-off of neurons. This would explain the minicolumnar differences noted by Casanova.

Either way, more numerous synaptic spines increase density of connections in the cortex, which must ultimately come at the expense of their effective range.
Truly pathological autism probably requires more than increased synaptic density and activity. The ultimate cause might be something as basic as pre- or postnatal viral infection or retroviral expression, combined with low vitamin D levels (as is likely the case for schizophrenia and bipolar disorder).

Schizophrenia as a sparse-connectivity disorder.

On the opposite end of cognitive spectrum, excessively sparse connectivity in the cortex seems to increase the risk of schizophrenia.

One such risk factor is a recently discovered greater ratio of astrocytes to neurons in schizophrenia, specifically in prefrontal cortex (astrocytes are a type of glial cell, covered above). This imbalance, demonstrated in a RIKEN study on stem cells and post-mortem brains of patients vs. controls, seems to be due to reduced expression of the gene DGCR8.
Fewer neurons means less robust networks, more vulnerable to acute damage, but more astrocytes can better maintain the remaining neurons against the regular wear and tear of our long life.

A Duke University study found a more direct “sparse” risk factor for schizophrenia: increased synaptic pruning. "Spine pruning theory is supported by the observation that the frontal brain regions of people with schizophrenia have fewer dendritic spines, the tentacles on the receiving ends of neurons that process signals from other cells". But this increased pruning happens during puberty, probably secondary to increased testosterone, vs. reduced pruning at 3-4 years old in autism.

Since schizophrenia is a uniquely human disorder, there must have been a reason for these risk factors to evolve. Other things that are unique for humans are large neocortex, complex society, and long life.
I think these risks and benefits are closely related: decreased density leaves more space and resources (such as astrocytes) for remaining neurons and connections, so they may grow longer. This enables global intellectual integrity, and thus dynamic social coordination and long-term planning in general.

More specific “sparse disorder” may be dyslexia. This connection was also made by Manuel Casanova: “Autism and dyslexia: A spectrum of cognitive styles as defined by minicolumnar Morphometry“, although there is a lot less research on that. Basically, he thinks that dyslexia is caused or exacerbated by a very “lossy” cognitive style, at least in some sensory association cortices.

Implications and speculations

Generalist vs. specialist trade-offs are somewhat ambiguous in terms of modern societal utility:
- On one hand, speed & precision was more important for survival in the wild, which may explain why apes seem to have photographic memory, superior to humans: Chimps beat humans in memory test.
- On the other hand, more recent functional differentiation of modern society once again requires increasingly “lossless” knowledge acquisition. Social positions that do require higher generalization are relatively few, including law, management, sales & marketing, politics and related academic disciplines.

In terms of gender, men are obviously overrepresented among extreme specialists, and even more so among generalists. This should be expected: extremes, especially those in environmental detachment, are quite risky, and risk is a male domain. Males don’t contribute nearly as much to reproduction as females, in some species nothing but their genes. But they have an additional or alternative purpose in evolution: to serve as a test vehicle for variations, initially genetic and lately also memetic.

Relatively speaking, women don’t take chances. They have two X chromosomes to conceal mutations, more symmetrical brains as a backup for damage, stronger immune system and higher HDL. This also applies to behavioral differences: lower testosterone and vasopressin to avoid risk, higher estrogen and oxytocin to seek and provide support, generally heightened senses to pay more attention to their bodies and immediate environment. Another salient difference is recently discovered higher myelination in female thalamus, likely related to faster and more frequent attention switching in women. All that must come at the expense of intellectual detachment and higher generalization.

Paradoxically, generalist bias may also be associated with smaller brain size, due to shorter global links. For example, it is known that low-generality savant abilities can be induced by inhibiting prefrontal cortex, presumably because top-down focus selectively inhibits bottom-up perception. Inversely, shorter distances improve signal propagation across global networks (such as fronto-parietal, fronto-temporal, and salience networks), suppressing bottom-up detail by top-down filtering. Increased selectivity is also necessary to compensate for reduced overall memory capacity of a smaller brain.

Another benefit of smaller size is potentially better quality of development. Given the same time-to-maturity, there is a well-known “slow growth vs. sloppy growth” trade-off in biology. Basically, slower growth allows for more time and effort to prevent and correct mistakes made during cellular division and other anabolic processes. For example, slower-growing axons are less affected by fluctuations in gradients guiding their growth cones. So, they will be straighter, further reducing the length of connections among cortical areas. And shorter connections are associated with higher IQ.

Of course, this is contrary to the conventional bigger-is-better view, supported by increasing brain size in human evolution. But this trend reversed after the Neolithic revolution, which might not be a bad thing. Some margin of that increased brain and body size is net-beneficial only for fight or flight emergencies. Given the drastically improved security of settled society, that margin should become net-detrimental, by impairing cellular-level quality. For example, although animals of larger species generally live longer, smaller individuals of the same species live longer than the larger ones in the absence of predation.

This is also true for women. But women have proportionally less white matter and more grey matter than men, which compensates for shorter distances. And I think subcortical differences are even more important: lower testosterone and higher oxytocin make women more sensitive to their immediate environment, especially social. They’re better at bottom-up perception, but are less free in top-down selection. Women do seem to have better integrity, but within a narrower range of interests.

On another note, IQ tests are inherently incapable of capturing higher generalization ability because they are time-limited. The tests are supposed to be background-neutral, except for verbal and math IQ. Thus, they can only measure our ability to discover patterns within data given to a subject during a relatively brief test. That means they’re biased toward the speed of learning, and sparse & slow subjects will be at a disadvantage. This is effectively confirmed by the finding that lobotomy, a procedure that disables prefrontal cortex (the seat of the highest generalization levels), has little or no impact on IQ.

The same bias is built into any educational system: the detail-oriented "dense" subjects are better at passive knowledge acquisition. "Sparse" architecture is an advantage for independent research and critical thinking, but those are far more difficult to evaluate. Also, modern science has already accumulated a very substantial body of knowledge, which must be passively acquired before one can make a novel contribution. That's a major problem for a generalist. Einstein’s observation that “imagination is more important than knowledge” may no longer hold in established fields (not mine).

There's been a lot of talk about association between "genius" and autism, which I think is misleading for two reasons. First, the diagnosis of autism includes asocial behavior, which is irrelevant: anyone with unusual interests will be correspondingly "asocial". It also includes avoidance of novelty, which is emotionally overwhelming for an autist. But a detached generalist would also avoid novelty, for the opposite reason: it is likely to be a trivial distraction relative to his own thoughts.

Second, it is far easier to recognize exceptional abilities of a specialist than those of a generalist. We all share lower generality levels: that's where the data comes from. But effective generality of top association cortices definitely differs among individuals, and it takes an equally competent generalist to evaluate the quality of generalizations. I think that’s partly why the quality of work in psycho-social sciences, and especially in philosophy, is so vastly inferior to that in more precise (lossless) "hard" sciences. So, an autistic genius is far more likely to gain societal recognition than an “anti-autistic” one.

Needless to say, my research of this subject is motivated by introspection.
