Cortical trade-offs in generalist vs. specialist bias

Neocortex is loosely organized as a hierarchy of generalization, in which stimuli selectively propagate from primary to association areas ("Cortex & Mind", Cortical Memory by Joaquin Fuster). Higher areas represent increasingly general patterns or concepts: spatial and temporal receptive field per neuron and cortical column expands with their elevation in cortical hierarchy.

Generalization is basically a pattern discovery process. In neural implementation, a pattern is a coincidence of multiple inputs, or presynaptic spikes. A sufficient number of coincident spikes triggers Hebbian learning: “fire together, wire together” between simultaneously spiking neurons. More precisely, a synapse is strengthened if the pre-synaptic neuron fires just before the post-synaptic one.
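As a rough illustration of the timing rule above, here is a minimal Python sketch of pair-based spike-timing-dependent plasticity. The function name, the amplitudes, and the 20 ms time constant are illustrative assumptions, not values from the sources cited here. The weakening branch for reversed spike order is the usual companion rule in such models and goes beyond what the text states.

```python
import math

def stdp_weight_change(t_pre_ms, t_post_ms, a_plus=0.01, a_minus=0.012, tau_ms=20.0):
    """Weight change for one pre/post spike pair (hypothetical parameters)."""
    dt = t_post_ms - t_pre_ms
    if dt > 0:
        # Pre-synaptic spike came first: strengthen, more for closer timing.
        return a_plus * math.exp(-dt / tau_ms)
    if dt < 0:
        # Post-synaptic spike came first: weaken (assumption, not in the text).
        return -a_minus * math.exp(dt / tau_ms)
    # Exactly simultaneous spikes: no change in this simplified sketch.
    return 0.0

# Pre fires 5 ms before post, then the reverse order.
strengthen = stdp_weight_change(t_pre_ms=10.0, t_post_ms=15.0)
weaken = stdp_weight_change(t_pre_ms=15.0, t_post_ms=10.0)
```

In this sketch, closer timing between the two spikes gives a larger change, which matches the idea that tighter coincidence counts as a stronger pattern.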

Most of neocortex is connections between neurons (dendrites and axons), plus their life support. Given limited resources within a skull, there must be a tradeoff between total number of connections and their average length. In other words, a network can be relatively dense, with more connections of shorter average length, or sparse, with fewer total connections of greater average length.

The choice of coincident inputs becomes exponentially greater with the length of connections. Hence, stronger patterns (greater number and closer timing of coincident input spikes) can be discovered. But that range must come at the cost of having fewer total connections, thus less detailed memory, which in turn requires greater selectivity in learning: longer reinforcement to form and strengthen synapses.
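The density vs. range reasoning above can be made concrete with a toy calculation in Python. This is only a sketch under loud assumptions: a fixed per-neuron wiring budget, a uniform neuron density (the 50,000 per cubic millimetre default is a placeholder, not a measured value), and reach modeled as a sphere whose radius equals the average connection length. All function names are hypothetical.

```python
import math

def connections_for_budget(wire_budget_um, mean_length_um):
    """Number of connections affordable under a fixed total wire length."""
    return wire_budget_um // mean_length_um

def candidate_inputs(mean_length_um, neurons_per_mm3=50_000):
    """Neurons within reach: sphere volume times an assumed uniform density."""
    radius_mm = mean_length_um / 1000.0
    volume_mm3 = (4.0 / 3.0) * math.pi * radius_mm ** 3
    return int(volume_mm3 * neurons_per_mm3)

def possible_patterns(candidates, inputs_per_pattern):
    """Number of distinct input sets of a given size among the candidates."""
    return math.comb(candidates, inputs_per_pattern)

budget_um = 100_000  # assumed wiring budget per neuron, in micrometres
dense = {"connections": connections_for_budget(budget_um, 200),
         "candidates": candidate_inputs(200)}
sparse = {"connections": connections_for_budget(budget_um, 1000),
          "candidates": candidate_inputs(1000)}
dense["patterns"] = possible_patterns(dense["candidates"], 3)
sparse["patterns"] = possible_patterns(sparse["candidates"], 3)
```

Under these assumptions the sparse configuration affords five times fewer connections, but chooses them from a much larger pool of candidate inputs, so the number of possible input combinations is far greater. The exact growth rate depends entirely on the assumed geometry.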

So, other things being equal, there must be a tradeoff between speed and detail of learning, and scope and stability of learned patterns. Relatively dense hierarchy prioritizes speed and detail, I call it a “specialist bias”, while sparse hierarchy selects for scope and persistence: my “generalist bias”.

Cellular factors in density vs. range tradeoff.

The initial determinant of cortical density is the rate of division and survival for neuronal progenitor cells during cortical development. Slower division or faster die-off leaves fewer progenitor cells, which will form fewer cortical neurons. That should leave more space and resources to grow longer connections among them. This is likely determined by nerve growth factors and receptors: higher activity should form a denser network. One such factor may be the CNTNAP2 gene, expressed mostly in prefrontal and parietal cortices, and probably correlated with autism: Genes for autism or genes for connectivity.

The most recognizable feature of neocortex is its six layers. Deeper and older layers VI and V mostly mediate cortico-subcortical integration, layer IV propagates data flow up the cortical hierarchy via thalamus, and newer layers II and III provide intra-cortical connectivity, mostly via layer I axons.

Henry Markram reported innate “peak connectivity” of layer V pyramidal cells at 300-500 µm. There are ~50 cell clusters (representation units) interlaced within that distance. It seems to me that these clusters provide minimal representation redundancy and mutual support via reverberating firing. Each cluster probably responds to some specific intensity of stimulus. These clusters inhibit each other within a column: “Sparse distributed coding model…” to adjust for redundancy within receptive field. So, variation in the range of such peak connectivity may be one of the dense vs. sparse factors.

A unique feature in human brain (and to a lesser extent in other primates and whales) is spindle cells. Wikipedia: "Spindle cells emerge postnatally and eventually become widely connected with diverse parts of the brain, evidencing their essential contributions to the superior capacity of hominids to focus on difficult problems." Axons of spindle cells are less branched than those of pyramidal neurons, and their extended range must come at the expense of reduced density of other connections. This trade-off probably enables better top-down (general-to-specific) focus in humans.

Another possible factor in the trade-off is the ratio of glia to neurons, which also determines sparsity. This is an excerpt from "The Root of Thought": "As we move up the evolutionary ladder, in a widely researched worm, Caenorhabditis elegans, glia are 16 percent of the nervous system. The fruit fly’s brain has about 20 percent glia. In rodents such as mice and rats, glia make up 60 percent of the nervous system. The nervous system of the chimpanzee has 80 percent glia, with the human at 90 percent. The ratio of glia to neurons increases with our definition of intelligence."

However, his interpretation that glia are the main information processing component in the human brain is implausible. I agree with the mainstream opinion that they mainly provide support for neurons.
Greater proportion of glia would reduce density of neurons, but enable higher activity and longer-range connections for each of them. Again, that means a sparser neural network.

Differences between higher and lower cortical regions

Very roughly, cortical hierarchy consists of four sub-hierarchies, listed from the bottom up:

- spectrum of primary-to-association cortices, within each of sensory and motor cortices
- posterior sensory and anterior motor cortices, the latter is somewhat higher in generalization
- lateral task-positive and medial default-mode networks, the latter is somewhat higher
- right and left hemispheres, the latter is somewhat higher

Joaquin Fuster on the differences between primary and association areas in Cortex & Mind, p. 73:
“At the lower level, representation is highly concrete and localized, and thus highly vulnerable. Local damage leads to well-delimited sensory deficit. In unimodal association cortex, representation is more categorical and more distributed, in networks that span relatively large sections of that cortex… In transmodal areas representation is even more widely distributed…” P. 82: “Thus a higher-level cognit (e.g., an abstract concept) would be represented in a wide network of association cortex…”
In my terms, wider networks imply “sparse bias” on higher levels of generalization.

Similar quotation via “How to Create a Mind” by Ray Kurzweil, p. 86: A study of 25 visual and multi-modal cortical areas by Daniel Felleman found that “As they went up the neocortical hierarchy… processing of patterns comprised larger spatial areas and involved longer time periods”.
Another study by Uri Hasson stated: “It is well established that neurons along the visual cortical pathways have increasingly larger spatial receptive field.” and found that “similar to cortical hierarchy of spatial receptive fields, there is a hierarchy of progressively longer temporal receptive windows”.

The neocortex is myelinated sequentially from primary to association areas at correspondingly increasing age (up to ~30 years of age for prefrontal cortex), and myelination then seems to decline in reversed order ("Human Neurophysiology", page 197). Allowing for a multi-year delay in knowledge accumulation, this probably reflects and/or determines the age at which abilities peak in fields that require knowledge of corresponding generality. It's known that athletic abilities (primary cortices) peak in the early 20s, and mathematical skills (likely parietal cortex) a bit later.

On the other hand, performance in business, politics, social sciences, and literature (prefrontal cortex?) doesn't peak until late in life. This is probably even more true in philosophy, but performance metrics there are questionable. Also supportive is the observation that cortical development sequence is delayed by several years in subjects with ADHD. Obviously, effective generality of discovered concepts, thus also development of higher association areas, depends on attention span.

Comparison between parietal and prefrontal cortex (highest levels of sensory and motor cortices):

Buschman & Miller of MIT "have found two types of attention in two separate regions of the brain. The prefrontal cortex is in charge of willful concentration; if you are studying for a test or writing a novel, the impetus and the orders come from there. But if there is a sudden, riveting event—the attack of a tiger or the scream of a child—it is the parietal cortex that is activated. The MIT scientists have learned that the two brain regions sustain concentration when the neurons emit pulses of electricity at specific rates—faster frequencies for the automatic processing of the parietal cortex, slower frequencies for the deliberate, intentional work of the prefrontal."
I think lower frequency here is due to longer feedback loop of higher levels.

I don’t have much info on task-positive vs. default-mode networks, the latter is not well understood. Next is cortical hemispheric asymmetry, AKA lateralization:

Left hemisphere represents higher-generality and long-term-goal-associated concepts, while the right one mostly searches in the background, for lower-level contextual patterns (Cortex & Mind, p. 184, Split Brain, Gazzaniga). According to my premise, left hemisphere should be relatively “sparse”, which is supported in “Cortex & Mind”, p. 185: “Pyramidal cells in language areas have been found to be larger on the left than on the right (Hayes & Lewis, 1995; Hutsler & Gazzaniga, 1997)”. Their dendritic trees also extend further than those of right-hemisphere pyramids (Jacobs & Scheibel, 1993).

Another study: "Hemispheric asymmetries in cerebral cortical networks" found that columns in left hemisphere contain fewer minicolumns and better myelinated axons than corresponding areas of right hemisphere, with same volume and number of synapses.
Hemispheres are densely interconnected by the corpus callosum. This is partly for sensory-motor field integration and duplication (fault-tolerance). But greater “lateralization” in humans, vs. other primates, suggests that our hemispheres also combine in a deeper hierarchy of generalization. A Finnish study found that ambidexterity (correlated with lesser lateralization) doubles the risk of ADHD and lower academic performance in children.

Autism as a dense-connectivity cognitive style.

The best evidence for individual differences in cognitive focus comes from research on autism, which is known to increase attention to details, often at the expense of higher generalization. So, it’s a good proxy for a "specialist phenotype", which according to my thesis should display greater short-range vs. long-range connectivity. Below, I summarize some evidence for such bias in autism.

Casanova in "Abnormalities Of Cortical Circuitry In The Brains Of Autistic Individuals" reports that autistic individuals have more numerous but smaller and more densely packed minicolumns, each containing smaller than normal neurons with shorter axons (I came across it via A Shade of Gray: excellent review of relevant research, highly recommend).
Related study "Comparison of the Minicolumnar Morphometry of Three Distinguished Neuroscientists and Controls" by Casanova is reported in "Minicolumns, Genius, and Autism". The connectivity pattern of the neuroscientists appears to be similar to autistics in the density and size of minicolumns, but different in better inhibitory isolation between adjacent minicolumns. This should compensate for smaller size, while enabling greater number of minicolumns.

Similar finding of more compact and better insulated minicolumns in primates and cetacea, compared to cats and lower mammals, was reported in A comparative perspective on minicolumns and inhibitory GABAergic interneurons in the neocortex.
The thesis of local vs. global connectivity bias in autism is also supported in Exploring the Folds of the Brain--And Their Links to Autism by Hilgetag and Barbas: "in autistic people, communication between nearby cortical areas increases, whereas communication between distant areas decreases".
Such cortex should be more reliant on cortico-thalamo-cortical vs. cortico-cortical connections, which might be the implication in Partially enhanced thalamocortical functional connectivity in autism.

Henry Markram, a leading neuroscientist, the father of an autistic son, and “pretty much an autist myself”, proposed The intense world theory - a unifying theory of the neurobiology of autism:

“The proposed neuropathology is hyper-functioning of local neural microcircuits, best characterized by hyper-reactivity and hyper-plasticity. Such hyper-functional microcircuits are speculated to become autonomous and memory trapped leading to the core cognitive consequences of hyper-perception, hyper-attention, hyper-memory and hyper-emotionality. The theory is centered on the neocortex and the amygdala, but could potentially be applied to all brain regions. The severity on each axis depends on the severity of the molecular syndrome expressed in different brain regions, which could uniquely shape the repertoire of symptoms of an autistic child. The progression of the disorder is proposed to be driven by overly strong reactions to experiences that drive the brain to a hyper-preference and overly selective state, which becomes more extreme with each new experience and may be particularly accelerated by emotionally charged experiences and trauma. This may lead to obsessively detailed information processing of fragments of the world and an involuntarily and systematic decoupling of the autist from what becomes a painfully intense world. The autistic is proposed to become trapped in a limited, but highly secure internal world with minimal extremes and surprises.”

"In the early phase of the child's life, repetition is a response to extreme fear. The autist perceives, feels and fears too much. Let them have their routines, no computers, television, no sharp colors, no surprises. It's the opposite of what parents are told to do. We actually think if you could develop a filtered environment in the early phase of life you could end up with an incredible genius child without many of the sensory challenges."

"The main critical periods for the brain during which time circuits form irreversibly are in the first few years (till about the age of 5 or so). We think this is an important age period when autism can either fully express to become a severe handicap or turned to become a major advantage. We think a calm filtered environment will not send the circuits into hyper-active modes, but the brain will keep most of its potential for plasticity. At later ages, filtered environments should help calm the autistic child and give them a starting point from where they can venture out. Each autistic child probably will first need its own bubble environment before one can start mixing bubbles. It should happen mostly on its own, but with very gentle guidance and encouragement. Do all you would want for your child ... but in slow motion ... let the child set the pace ... they need that control to feel secure enough to begin to venture off into any other bubbles."

A recent study found one reason for such intense perception: reduced synaptic spine pruning in autistic brains, secondary to mTOR over-expression. This seems to happen at the age of 3-4 years, when synaptic spine density normally decreases by ~50%. I suspect reduced pruning may also happen during prenatal development. If so, greater synaptic density should increase activity, thereby reducing normal prenatal die-off of neurons. This would explain the minicolumnar differences noted by Casanova.

Either way, more numerous synaptic spines increase density of connections in the cortex, which must ultimately come at the expense of their effective range.
Truly pathological autism probably requires more than increased synaptic density and activity. The ultimate cause might be something as basic as pre- or postnatal viral infection or retroviral expression, combined with low vitamin D levels (as is likely the case for schizophrenia and bipolar disorder).

Sparse connectivity as a risk factor for schizophrenia.

One such factor is a greater ratio of astrocytes to neurons in schizophrenia, specifically in prefrontal cortex (astrocytes are a type of glial cell, covered above). This imbalance was recently discovered in a RIKEN study on stem cells and post-mortem brains of patients vs. controls. It seems to be due to reduced expression of the gene DGCR8. Fewer neurons make the network more vulnerable to acute damage, but more astrocytes can better maintain the remaining neurons against the regular wear and tear of our long life.

A Duke University study found a more direct “sparse” risk factor for schizophrenia, increased synaptic pruning: "Spine pruning theory is supported by the observation that the frontal brain regions of people with schizophrenia have fewer dendritic spines, the tentacles on the receiving ends of neurons that process signals from other cells". But this increased pruning happens during puberty, probably secondary to increased testosterone, vs. reduced pruning at 3-4 years of age in autism.

Schizophrenia seems to be a uniquely human disorder, and there must be a reason these risk factors evolved. Other things that are unique for humans are large neocortex, complex society, and long life.
I think these risks and benefits are closely related: decreased density leaves more space and resources (such as astrocytes) for remaining neurons and connections, so they may grow longer. This enables global intellectual integrity, thus dynamic social coordination, and long-term planning in general.

A more specific “sparse disorder” may be dyslexia. This connection was also made by Manuel Casanova: “Autism and dyslexia: A spectrum of cognitive styles as defined by minicolumnar Morphometry”, although there is a lot less research on that. Basically, he thinks that dyslexia is caused or exacerbated by a “lossy” cognitive style, secondary to sparse connectivity, at least in language-oriented cortices.

Implications and speculations

Generalist vs. specialist tradeoffs are somewhat ambiguous in terms of modern societal utility:
- On one hand, speed & precision were more important for survival in the wild, which may explain why apes seem to have photographic memory, superior to humans: Chimps beat humans in memory test.
- On the other hand, more recent functional differentiation of modern society once again requires increasingly “lossless” knowledge acquisition. Social positions that do require higher generalization are relatively few, including law, management, politics and related academic disciplines.

In terms of gender, men are obviously overrepresented among extreme specialists, and even more so among generalists. This should be expected: extremes, especially that of environmental detachment, are risky, and risk is a male domain. Males don’t contribute nearly as much to reproduction as females, in some species only their genes. Their additional purpose in evolution is to serve as a test vehicle for variations, initially genetic and later also memetic. Males are a far better target for sexual selection because their mutations are more likely to be expressed: they have only one X chromosome.

Relatively speaking, women don’t take chances. They have two X chromosomes to conceal mutations, more symmetrical brains as a backup for damage, stronger immune system and higher HDL. Same for behavior: they have lower testosterone and vasopressin to avoid risk, higher estrogen and oxytocin to seek and provide support, generally heightened senses to pay more attention to their bodies and immediate environment. Another salient difference is recently discovered higher myelination in female thalamus, likely related to faster and more frequent attention switching in women. All that must come at the expense of intellectual detachment and higher generalization.

Paradoxically, generalist bias may also be associated with smaller brain size, due to shorter global links. For example, it is known that low-generality savant abilities can be induced by inhibiting prefrontal cortex, presumably because top-down speculation competes with bottom-up perception. Inversely, shorter distances improve signal propagation across global networks (such as fronto-parietal, fronto-temporal, salience networks), suppressing bottom-up detail by top-down filtering. And selection for stronger patterns is necessary to compensate for reduced overall memory capacity of a smaller brain.

Another benefit of smaller size is potentially better quality of development over the same time: there is a well-known “slow growth vs. sloppy growth” trade-off in biology. Basically, slower growth allows for more time and resources to prevent and correct mistakes made during cellular division and other anabolic processes. For example, slower-growing axons are less affected by short-term fluctuation in gradients guiding their growth cones. So, they will be straighter, further reducing the length of connections among cortical areas. And shorter connections are associated with higher IQ.

Of course, this is contrary to the conventional bigger-is-better view, supported by increasing brain size in human evolution. But this trend reversed after the Neolithic revolution, which might not be a bad thing. Some margin of increased brain and body size is net-beneficial only for fight-or-flight emergencies. Given drastically improved security of settled society, that margin should become net-detrimental, by impairing cellular quality. For example, although animals of larger species generally live longer, smaller individuals of the same species live longer than the larger ones in the absence of predation.

This is also true for women. But, women have proportionally less white matter and more grey matter than men, which compensates for shorter distances. And I think subcortical differences are even more important: lower testosterone and higher oxytocin makes women more sensitive to their immediate environment, especially social one. They’re better at bottom-up perception, but less free in top-down selection. Women do seem to have better integrity, but within a narrower range of interests.

On another note, IQ tests are inherently incapable of capturing higher generalization ability because they are time-limited. The tests are supposed to be background-neutral, except for verbal and math IQ. Thus, they can only measure our ability to discover patterns within data given to a subject during a relatively brief test. That means they’re biased toward the speed of learning, where “sparse” subjects are at a disadvantage. This is effectively confirmed by the finding that lobotomy, which disables prefrontal cortex (the seat of highest generalization levels), has little or no impact on IQ.

The same bias is built into any educational system: detail-oriented "dense" bias is better for passive knowledge acquisition. "Sparse" bias is better at independent research, but that’s far more difficult to evaluate. And modern science amassed a huge body of knowledge, which must be acquired before one can make a novel contribution. That's a major disadvantage for a generalist. Einstein’s assertion that “imagination is more important than knowledge” may no longer hold in established fields (not mine).

There's been a lot of talk about association between "genius" and autism, which I think is misleading for two reasons. First, the diagnosis of autism includes asocial behavior, which is irrelevant: anyone with unusual interests will be correspondingly "asocial". Closely related is avoidance of novelty, which is emotionally overwhelming for an autist. But a detached generalist would also avoid society and novelty, for the opposite reason: they are likely to be a trivial distraction to his own thoughts.

Second, it is far easier to recognize exceptional abilities of a specialist than those of a generalist. We all share lower generality levels: that's where the data comes from, leaving less to interpretation. But effective generality of top association cortices definitely differs among individuals, and it takes an equally competent generalist to evaluate the quality of generalizations. This is why the work in psycho-social sciences, and especially in philosophy, is so vastly inferior to that in relatively lossless hard sciences. So, an autistic genius is far more likely to gain recognition than an “anti-autistic” one.

Needless to say, this write-up is motivated by introspection.
