Cortical trade-offs in generalist vs. specialist bias

The neocortex, the most advanced part of the human brain (roughly three quarters of its volume), is loosely organized in a hierarchy of generalization. In this hierarchy, stimuli selectively propagate from primary to association areas ("Cortex & Mind", Cortical Memory by Joaquin Fuster). Higher areas represent increasingly general patterns: the spatial and temporal receptive fields of neurons and cortical columns expand with their elevation in this hierarchy.

Generalization is just another word for pattern discovery, and the most basic pattern is a coincidence of multiple inputs. For a neuron, the inputs are presynaptic spikes. A sufficient number of coincident spikes triggers Hebbian learning: "fire together, wire together" among simultaneously spiking neurons. More precisely, a synapse is strengthened if the pre-synaptic neuron fires just before the post-synaptic one (spike-timing-dependent plasticity).
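The asymmetric Hebbian rule above can be sketched as a minimal spike-timing-dependent plasticity (STDP) update. The function name, amplitudes, and time constant here are illustrative assumptions, not a model of any specific circuit:

```python
import numpy as np

def stdp_update(w, dt, a_plus=0.01, a_minus=0.012, tau=20.0):
    """Asymmetric Hebbian (STDP) update of one synaptic weight.

    dt = t_post - t_pre in ms: positive when the pre-synaptic spike
    arrives just before the post-synaptic one (strengthening),
    negative for the reverse order (weakening).
    """
    if dt > 0:
        w += a_plus * np.exp(-dt / tau)    # pre before post: potentiate
    else:
        w -= a_minus * np.exp(dt / tau)    # post before pre: depress
    return float(np.clip(w, 0.0, 1.0))     # keep weight bounded in [0, 1]

w = 0.5
w_up = stdp_update(w, dt=5.0)     # pre fires 5 ms before post: weight grows
w_down = stdp_update(w, dt=-5.0)  # reversed order: weight shrinks
```

The exponential decay captures the "closer timing, stronger learning" aspect: the change shrinks as the spikes drift apart in time.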

Most of the neocortex consists of connections between neurons (dendrites and axons), plus their life support. Given the limited resources within the skull, there must be a trade-off between the total number of connections and their average length. In other words, cortex can be relatively dense, with more connections of shorter average length, or sparse, with fewer total connections of greater average length.

The choice of coincident inputs grows combinatorially with the length of connections. Hence, stronger patterns (a greater number and closer timing of coincident input spikes) can be discovered. But that greater range must come at the cost of fewer total connections, thus less detailed memory. Which in turn requires greater selectivity in learning: longer reinforcement to form and strengthen synapses.
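The combinatorial growth of candidate inputs with connection range can be illustrated with a toy calculation. The unit neuron density and the 5-input coincidence pattern are arbitrary assumptions, chosen only to show the shape of the growth:

```python
from math import comb, pi

# Toy model: neurons at unit density; a neuron can draw inputs from any
# neuron within its connection radius r.  Candidate inputs grow ~r^3,
# while the number of possible coincidence patterns (here: choices of
# 5 coincident inputs) grows combinatorially with that count.
for r in (2, 4, 8, 16):
    n = int(4 / 3 * pi * r**3)        # neurons within reach
    print(f"r={r:2}  inputs={n:6}  5-input patterns={comb(n, 5):.2e}")
```

Doubling the radius multiplies the reachable inputs ~8x, but multiplies the number of possible coincidence patterns by several orders of magnitude, which is the sense in which longer connections open up a vastly larger search space for patterns.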

So, other things being equal, there must be a trade-off between the speed and detail of learning on one hand, and the relative scope and stability of learned patterns on the other. The former is prioritized in a dense hierarchy ("specialist bias"), and the latter in a sparse hierarchy ("generalist bias").

Cellular factors in the density vs. range trade-off

An initial determinant of cortical density is the rate of division and survival of neuronal progenitor cells during cortical development. Slower division or faster die-off leaves fewer progenitor cells, which will form fewer cortical neurons. That should leave more space and resources to grow longer connections among them. This is likely determined by nerve growth factors and their receptors: higher activity should form a denser network. One such factor may be coded by the CNTNAP2 gene, expressed mostly in prefrontal and parietal cortices, and probably correlated with autism: Genes for autism or genes for connectivity.

The most recognizable feature of the neocortex is its six layers. The deeper and older layers VI and V mostly mediate cortico-subcortical integration, layer IV receives data flowing upward the cortical hierarchy, largely via the thalamus, and the newer layers II and III provide intra-cortical connectivity, mostly via layer I axons.

Henry Markram recently reported innate "peak connectivity" of layer V pyramidal cells at 300–500 μm. There are ~50 cell clusters (representation units) interlaced within that distance. It seems to me that these clusters provide minimal representation redundancy and mutual support via reverberating firing. Each cluster probably responds to some specific intensity of stimulus. These clusters inhibit each other within a column ("Sparse distributed coding model…"), adjusting for redundancy within a receptive field.
So, variation in the range of such peak connectivity may be one of the main factors in the dense vs. sparse bias.

A unique feature of the human brain (and to a lesser extent of other primates and whales) is spindle cells, also known as von Economo neurons. Wikipedia: "Spindle cells emerge postnatally and eventually become widely connected with diverse parts of the brain, evidencing their essential contributions to the superior capacity of hominids to focus on difficult problems." Axons of spindle cells are less branched than those of pyramidal neurons, and their extended range must come at the expense of reduced density of other connections. This trade-off probably enables better top-down (general-to-specific) focus in humans.

Another possible factor in the trade-off is the ratio of glia to neurons, which I think is also a good sign of a "sparse" architecture. This is an excerpt from "The Root of Thought": "As we move up the evolutionary ladder, in a widely researched worm, Caenorhabditis elegans, glia are 16 percent of the nervous system. The fruit fly's brain has about 20 percent glia. In rodents such as mice and rats, glia make up 60 percent of the nervous system. The nervous system of the chimpanzee has 80 percent glia, with the human at 90 percent. The ratio of glia to neurons increases with our definition of intelligence."

However, his interpretation of glia as the main information-processing component of the human brain is implausible; I agree with the mainstream opinion that they mainly provide support for neurons.
A greater proportion of glia would reduce the density of neurons, but enable higher activity and longer-range connections for each of them. Again, that means a sparser network.

I might be wrong on much of the specifics (above and below), but that wouldn't affect the "sparse vs. dense" premise. This tradeoff may differ among cortical areas, but I think specificity of such differences is relatively low. That’s because the number of progenitor cells in the cortical sub-plate is set before any significant differentiation between the areas. Also, genetic variation among individuals is very minor, and whatever differences develop through postnatal learning are already affected by innate biases.

Differences between higher and lower cortical regions

Very roughly, the cortical hierarchy consists of four sub-hierarchies, listed from the bottom up:

- spectrum of primary-to-association cortices, within each of sensory and motor cortices
- posterior sensory and anterior motor cortices, the latter is somewhat higher in generalization
- lateral task-positive and medial default-mode networks, latter is somewhat higher
- right and left hemispheres, latter is somewhat higher

Joaquin Fuster explained the differences between primary and hierarchically higher association areas in Cortex & Mind, p. 73: "At the lower level, representation is highly concrete and localized, and thus highly vulnerable. Local damage leads to well-delimited sensory deficit. In unimodal association cortex, representation is more categorical and more distributed, in networks that span relatively large sections of that cortex… In transmodal areas representation is even more widely distributed…" And on p. 82: "Thus a higher-level cognit (e.g., an abstract concept) would be represented in a wide network of association cortex…" In my terms, wider networks imply a "sparse bias" on higher levels of generalization.

A similar quotation via "How to Create a Mind" by Ray Kurzweil, p. 86: a study of 25 visual and multimodal cortical areas by Daniel Felleman found that "as they went up the neocortical hierarchy… processing of patterns comprised larger spatial areas and involved longer time periods".
Another study, by Uri Hasson, stated: "It is well established that neurons along the visual cortical pathways have increasingly larger spatial receptive fields," and found that "similar to the cortical hierarchy of spatial receptive fields, there is a hierarchy of progressively longer temporal receptive windows".

The neocortex is myelinated sequentially from primary to association areas, at correspondingly increasing age (up to ~30 years old for the prefrontal cortex), and myelination then seems to decline in reverse order ("Human Neurophysiology", page 197). Allowing for a multi-year delay in knowledge accumulation, this probably reflects and/or determines the age at which abilities peak in fields that require knowledge of corresponding generality. It's known that athletic abilities (primary cortices) peak in the early 20s, and mathematical skills (likely parietal cortex) a bit later.

On the other hand, performance in business, politics, social sciences, and literature (anterior prefrontal cortex?) doesn't peak until late in life. This is probably even more true in philosophy, but performance metrics there are questionable. Also supportive is the observation that this cortical development sequence is delayed by several years in subjects with ADHD. Obviously, the effective generality of discovered concepts, thus also the development of higher association areas, depends on attention span.

A comparison between the parietal and prefrontal cortices (the highest levels of sensory and motor cortices):

Buschman & Miller of MIT found two types of attention in two separate regions of the brain: "The prefrontal cortex is in charge of willful concentration; if you are studying for a test or writing a novel, the impetus and the orders come from there. But if there is a sudden, riveting event—the attack of a tiger or the scream of a child—it is the parietal cortex that is activated. The MIT scientists have learned that the two brain regions sustain concentration when the neurons emit pulses of electricity at specific rates—faster frequencies for the automatic processing of the parietal cortex, slower frequencies for the deliberate, intentional work of the prefrontal." I think the higher-frequency waves are associated with reaction speed and detail, and the lower frequencies with the longer feedback loop of higher levels.

I don't have much info on task-positive vs. default-mode networks; the latter is not well understood.

Cortical hemispheric asymmetry, AKA lateralization:

The left hemisphere represents higher-generality concepts, especially semantic ones. The right hemisphere works mostly in the background, likely searching for lower-level contextual patterns (Cortex & Mind, p. 184; Split Brain, Gazzaniga). According to my premise, the left hemisphere should be relatively "sparse", which is supported in "Cortex & Mind", p. 185: "Pyramidal cells in language areas have been found to be larger on the left than on the right (Hayes & Lewis, 1995; Hutsler & Gazzaniga, 1997)". Their dendritic trees also extend further than those of right-hemisphere pyramids (Jacobs & Scheibel, 1993).

Another study, "Hemispheric asymmetries in cerebral cortical networks", found that columns in the left hemisphere contain fewer minicolumns and better-myelinated axons than corresponding areas of the right hemisphere, with the same volume and number of synapses. The hemispheres are densely interconnected by the corpus callosum, partly for sensory-motor field integration and duplication (fault tolerance).
But the greater "lateralization" in humans, vs. other primates, suggests that our hemispheres also combine into a deeper hierarchy of generalization. A Finnish study found that ambidexterity (correlated with lesser lateralization) doubles the risk of ADHD and lower academic performance in children.

Autism as a dense-connectivity cognitive style

The best evidence for individual differences in cognitive focus comes from research on autism, which is known to increase attention to details, often at the expense of higher generalization. So, it’s a good proxy for a "specialist phenotype", which according to my thesis should display greater short-range vs. long-range connectivity. Below, I summarize some evidence for such bias in autism.

Casanova, in "Abnormalities Of Cortical Circuitry In The Brains Of Autistic Individuals", reports that autistic individuals have more numerous but smaller and more densely packed minicolumns, each containing smaller-than-normal neurons with shorter axons (I came across it via A Shade of Gray, an excellent review of relevant research that I highly recommend).
A related study by Casanova, "Comparison of the Minicolumnar Morphometry of Three Distinguished Neuroscientists and Controls", is reported in "Minicolumns, Genius, and Autism". The connectivity pattern of the neuroscientists appears to be similar to that of autistics in the density and size of minicolumns, but differs in better inhibitory isolation between adjacent minicolumns. This should compensate for their smaller size, while enabling a greater number of minicolumns.

A similar finding of more compact and better-insulated minicolumns in primates and cetaceans, compared to cats and lower mammals, was reported in "A comparative perspective on minicolumns and inhibitory GABAergic interneurons in the neocortex".
The thesis of a local vs. global connectivity bias in autism is also supported in "Exploring the Folds of the Brain--And Their Links to Autism" by Hilgetag and Barbas: "in autistic people, communication between nearby cortical areas increases, whereas communication between distant areas decreases".
Such a cortex should be more reliant on cortico-thalamo-cortical vs. cortico-cortical connections, which might be the implication of "Partially enhanced thalamocortical functional connectivity in autism".

Henry Markram, a leading neuroscientist, the father of an autistic son, and "pretty much an autist myself", proposed The intense world theory - a unifying theory of the neurobiology of autism:

“The proposed neuropathology is hyper-functioning of local neural microcircuits, best characterized by hyper-reactivity and hyper-plasticity. Such hyper-functional microcircuits are speculated to become autonomous and memory trapped leading to the core cognitive consequences of hyper-perception, hyper-attention, hyper-memory and hyper-emotionality. The theory is centered on the neocortex and the amygdala, but could potentially be applied to all brain regions. The severity on each axis depends on the severity of the molecular syndrome expressed in different brain regions, which could uniquely shape the repertoire of symptoms of an autistic child. The progression of the disorder is proposed to be driven by overly strong reactions to experiences that drive the brain to a hyper-preference and overly selective state, which becomes more extreme with each new experience and may be particularly accelerated by emotionally charged experiences and trauma. This may lead to obsessively detailed information processing of fragments of the world and an involuntarily and systematic decoupling of the autist from what becomes a painfully intense world. The autistic is proposed to become trapped in a limited, but highly secure internal world with minimal extremes and surprises.”

"In the early phase of the child's life, repetition is a response to extreme fear. The autist perceives, feels and fears too much. Let them have their routines, no computers, television, no sharp colors, no surprises. It's the opposite of what parents are told to do. We actually think if you could develop a filtered environment in the early phase of life you could end up with an incredible genius child without many of the sensory challenges."

"The main critical periods for the brain during which time circuits form irreversibly are in the first few years (till about the age of 5 or so). We think this is an important age period when autism can either fully express to become a severe handicap or turned to become a major advantage. We think a calm filtered environment will not send the circuits into hyper-active modes, but the brain will keep most of its potential for plasticity. At later ages, filtered environments should help calm the autistic child and give them a starting point from where they can venture out. Each autistic child probably will first need its own bubble environment before one can start mixing bubbles. It should happen mostly on its own, but with very gentle guidance and encouragement. Do all you would want for your child ... but in slow motion ... let the child set the pace ... they need that control to feel secure enough to begin to venture off into any other bubbles."

A recent study found one reason for such intense perception: reduced synaptic spine pruning in autistic brains, secondary to mTOR over-expression. This seems to happen at the age of 3–4 years, when synaptic spine density normally decreases by ~50%. I suspect reduced pruning may also happen during prenatal development. If so, greater synaptic density should increase activity, thereby reducing the normal prenatal die-off of neurons. This would explain the minicolumnar differences noted by Casanova.

Either way, more numerous synaptic spines increase the density of connections in the cortex, which must ultimately come at the expense of their effective range.
Truly pathological autism probably requires more than increased synaptic density and activity. The ultimate cause might be something as basic as pre- or post-natal viral infection or retroviral expression, combined with low vitamin D levels (as is likely the case for schizophrenia and bipolar disorder).

Sparse connectivity as a risk factor for schizophrenia

One such factor is a greater ratio of astrocytes to neurons in schizophrenia, specifically in the prefrontal cortex (astrocytes are a type of glial cell, covered above). This imbalance was recently discovered in a RIKEN study of stem cells and post-mortem brains of patients vs. controls. It seems to be due to reduced expression of the gene DGCR8. Fewer neurons make the network more vulnerable to acute damage, but more astrocytes can better maintain the remaining neurons through the regular wear and tear of a long life.

A Duke University study found a more direct "sparse" risk factor for schizophrenia: increased synaptic pruning. "Spine pruning theory is supported by the observation that the frontal brain regions of people with schizophrenia have fewer dendritic spines, the tentacles on the receiving ends of neurons that process signals from other cells." But this increased pruning happens during puberty, probably secondary to increased testosterone, vs. the reduced pruning at age 3–4 in autism.

Since schizophrenia is a uniquely human disorder, there must have been a reason for these risk factors to evolve. Other things that are unique to humans are a large neocortex, complex society, and long life.
I think these risks and benefits are closely related: decreased density leaves more space and resources (such as astrocytes) for the remaining neurons and connections, so they may grow longer. This enables global intellectual integrity, thus dynamic social coordination and long-term planning in general.

A more specific "sparse disorder" may be dyslexia. This connection was also made by Manuel Casanova: "Autism and dyslexia: A spectrum of cognitive styles as defined by minicolumnar morphometry", although there is far less research on that. Basically, he thinks that dyslexia is caused or exacerbated by a very "lossy" cognitive style, at least in some sensory association cortices.

Implications and speculations

Generalist vs. specialist trade-offs are somewhat ambiguous in terms of modern societal utility:
- On one hand, speed and precision were more important for survival in the wild, which may explain why apes seem to have photographic memory superior to humans': Chimps beat humans in memory test.
- On the other hand, the more recent functional differentiation of modern society once again requires increasingly "lossless" knowledge acquisition. Social positions that do require higher generalization are relatively few, including law, management, politics, and related academic disciplines.

In terms of gender, men are obviously overrepresented among extreme specialists, and even more so among generalists. This should be expected: extremes, especially those involving environmental detachment, are quite risky, and risk is a male domain. Males don't contribute nearly as much to reproduction as females, in some species nothing but their genes. But they have an additional or alternative purpose in evolution: to serve as a test vehicle for variations, initially genetic and lately also memetic.

Relatively speaking, women don't take chances. They have two X chromosomes to conceal mutations, more symmetrical brains as a backup for damage, a stronger immune system, and higher HDL. This also applies to behavioral differences: lower testosterone and vasopressin to avoid risk, higher estrogen and oxytocin to seek and provide support, and generally heightened senses to pay more attention to their bodies and immediate environment. Another salient difference is the recently discovered higher myelination in the female thalamus, likely related to faster and more frequent attention switching in women. All that must come at the expense of intellectual detachment and higher generalization.

Paradoxically, a generalist bias may also be associated with smaller brain size, due to shorter global links. For example, it is known that low-generality savant abilities can be induced by inhibiting the prefrontal cortex, presumably because top-down speculation competes with bottom-up perception. Conversely, shorter distances improve signal propagation across global networks (such as the fronto-parietal, fronto-temporal, and salience networks), suppressing bottom-up detail by top-down filtering. And selection for stronger patterns is necessary to compensate for the reduced overall memory capacity of a smaller brain.

Another benefit of smaller size is potentially better quality of development. Given the same time to maturity, there is a well-known "slow growth vs. sloppy growth" trade-off in biology. Basically, slower growth allows more time and resources to prevent and correct mistakes made during cellular division and other anabolic processes. For example, slower-growing axons are less affected by short-term fluctuations in the gradients guiding their growth cones. So they will be straighter, further reducing the length of connections among cortical areas. And shorter connections are associated with higher IQ.

Of course, this is contrary to the conventional bigger-is-better view, supported by increasing brain size over human evolution. But this trend reversed after the Neolithic revolution, which might not be a bad thing. Some margin of that increased brain and body size is net-beneficial only for fight-or-flight emergencies. Given the drastically improved security of settled society, that margin should become net-detrimental, by impairing cellular quality. For example, although animals of larger species generally live longer, smaller individuals of the same species live longer than larger ones in the absence of predation.

This is also true for women. But women have proportionally less white matter and more grey matter than men, which compensates for the shorter distances. And I think subcortical differences are even more important: lower testosterone and higher oxytocin make women more sensitive to their immediate environment, especially the social one. They're better at bottom-up perception, but less free in top-down selection. Women do seem to have better integrity, but within a narrower range of interests.

On another note, IQ tests are inherently incapable of capturing higher generalization ability because they are time-limited. The tests are supposed to be background-neutral, except for verbal and math IQ. Thus, they can only measure our ability to discover patterns within data given to a subject during a relatively brief test. That means they're biased toward the speed of learning, where "sparse" subjects are at a disadvantage. This is effectively confirmed by the finding that lobotomy, which disables the prefrontal cortex (the seat of the highest generalization levels), has little or no impact on IQ.

The same bias is built into any educational system: a detail-oriented "dense" bias is better for passive knowledge acquisition. A "sparse" bias is better for independent research, but that's far more difficult to evaluate. And modern science has amassed a huge body of knowledge, which must be acquired before one can make a novel contribution. That's a major disadvantage for a generalist. Einstein's assertion that "imagination is more important than knowledge" may no longer hold in established fields (not mine).

There's been a lot of talk about the association between "genius" and autism, which I think is misleading for two reasons. First, the diagnosis of autism includes asocial behavior, which is irrelevant: anyone with unusual interests will be correspondingly "asocial". Closely related is avoidance of novelty, which is emotionally overwhelming for an autist. But a detached generalist would also avoid society and novelty, because they are likely to be underwhelming: just a trivial distraction relative to his own thoughts.

Second, it is far easier to recognize the exceptional abilities of a specialist than those of a generalist. We all share the lower generality levels, which is where the data comes from, leaving less to interpretation. But the effective generality of top association cortices definitely differs among individuals. It takes an equally competent generalist to evaluate the quality of generalizations. Which is why the work in psycho-social sciences, and especially in philosophy, is so vastly inferior to that in the relatively lossless hard sciences.
So, an autistic genius is far more likely to gain recognition than an "anti-autistic" one.

Needless to say, this write-up is motivated by introspection.

Cultivating top-down focus

Theoretical work is driven by sustained top-down vs. bottom-up attention. The top is long-term priorities, derived from broad generalizations, and the bottom is current experience. Evolution always neglected the long term: people didn't survive very long unless they paid close attention to their immediate environment. Modern society is drastically more secure, but our attention span has barely budged. In fact, it's getting worse for the majority: they just elected an ADHD-addled clown-in-chief.

“Thinking is to people as swimming is to cats: they can do it but prefer not to” Daniel Kahneman.
“I have no special talent, I am only passionately curious” Albert Einstein.

A lot of people could become world-changing geniuses if they spent 10 years of their youth fully focused on an important problem. But that must come at the cost of "life": unthinkable for the hand-to-mouth hunter-gatherers that we evolved to be. I first decided on my top priority in adolescence. But maintaining effective working focus on these abstractions, vs. "real" distractions, was far more difficult. Over the years, I greatly improved my concentration via the following techniques:

Practice, externalization, formalization

Anything profound is initially boring; curiosity is cultivated by incrementally deeper study or design. Such practice forms redundant representations, differentiated by their context, to explore alternative scenarios. This helps to maintain parallel subconscious search threads, even when your consciousness is distracted. They also fill up memory and starve unrelated subjects of resources. This is very important: irrelevant memories keep competing for our attention until they fade out.

Obsessed with externalities as we are, we need a conducive environment to facilitate this virtuous cycle of practice. The basic working environment is a notepad or a computer screen, so we need to fill them with a well-designed write-up of the subject. Quite obviously, the brain has plenty of memory for a few pages of text; the scarce resource here is our attention. Writing down thoughts turns them into sensory feedback, which is far more effective at maintaining conscious attention than "internal" abstractions. Motor feedback also helps: verbalizing, writing by hand, semi-random editing or re-arranging of text or code.

Another focus aid is formalization: developing subject-specific terminology, abbreviations, and symbols. This is critical for building a concise and comprehensive model of a subject, structured to reverberate within working memory. Such a model must be incrementally refined and extended; nothing worthwhile can be done on the first try. Refining means resolving internal contradictions and eliminating overlaps, vs. simply accumulating related aspects and perspectives.

Stimulation and avoiding distractions

One of the most important "environment and stimulants" is the people we deal with. Your listener's attention (if credible) stimulates yours, even when he doesn't contribute anything. To facilitate this, universities and companies impose face-to-face contact among colleagues. But the relevance of these institutions themselves depends on societal consumer competence, which is sorely lacking for higher-generality subjects. Luckily, social stimulation can be replaced by writing or talking to oneself.

Besides relevant stimulation (be honest about "relevant"), one must block the irrelevant kind. Real-life socializing is almost always meaningless, compared to impersonal reading and writing. People are desperate to join a group, and rejection feels like a death sentence. But if there is no sufficiently relevant group, any socializing is a huge waste of mindspace. However miserable social isolation feels at first, you will get used to it. For a broadly stimulated brain with a clear purpose, attention is a zero-sum game.

Such broad stimulation is easy: tea, cocoa, and low-dose nicotine (patch) do it for me. As distinct from smoking, nicotine itself is pretty benign; see Gwern. For the less intrinsically stimulated, there are Ritalin, Adderall, deprenyl, modafinil, etc. Another potent stimulant is exercise while working. I work at a treadmill desk and alternate between walking, standing, and sitting, all while remaining in front of a projector screen (which is more "immersive" and distant than a monitor: it doesn't jump in the eyes as much when you walk). Definitely recommended; it probably added ~2 hours of work per day.

Besides socializing, the worst attention hog now is the web, and my solution is rationing. Unless there is something urgent or work-related (unlikely), I only connect for ~2 hours a day. Sticking to it was a challenge; I have to use "Freedom" to keep myself honest. Sounds trivial, but it made a huge difference to my concentration. You may even want to lock yourself in for a fixed time: just put the key in a kitchen safe. And don't even get me started on the current cellphone plague; I never wanted one.

Direct self-conditioning

But even more insidious, at least for a generalist like me, are internal distractions: wandering thoughts.
There is a low-tech solution, thought conditioning, and it may be the most effective suggestion here.

Aversive conditioning is simple and old-fashioned: just slap your face when distracted. But you must be serious; it's a war with your own reptilian brain. Slapping must become a reflexive habit, something you no longer decide on. If you can't bring yourself to slapping, there is a rubber band or Pavlok. Irrelevant subjects will acquire unpleasant associations, and you will avoid thinking about them. After that, it's enough to simply monitor your thoughts for distractions and repeat a mantra: "it doesn't matter".

Positive conditioning of relevant thoughts is far more difficult: they are fluid and don't associate with specific cues for conventional reinforcement. Less specific but still helpful is reserving a specific desk, computer, and time only for work. Also useful is neurofeedback (article). I currently use a very simple version: every day, I write down the number of hours spent on work, multiplied by their effectiveness relative to the average effectiveness of recent working hours. It does help a bit.
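As a minimal sketch, the daily score described above reduces to a one-line formula. The function name and the 1–10 effectiveness scale are my illustrative assumptions:

```python
def focus_score(hours, effectiveness, recent_avg_effectiveness):
    """Daily neurofeedback score: hours worked, weighted by today's
    subjective effectiveness relative to the recent average
    (any consistent scale works, e.g. 1-10)."""
    return hours * effectiveness / recent_avg_effectiveness

# e.g. 6 hours at effectiveness 8 vs. a recent average of 6.4:
print(focus_score(6, 8, 6.4))  # 7.5
```

Normalizing by the recent average keeps the score comparable across weeks, so the feedback rewards relative improvement rather than raw hours.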

Advanced neurofeedback may become possible via transcranial imaging to visualize cortical activity. Eventually, we will directly stimulate the cortical areas that represent relevant subjects. Stimulation by red and infrared light is already feasible, but very imprecise. Overall, top-down attention seems to be localized to the left dorsolateral prefrontal cortex: the highest level of task-specific generalization. But symbolic and mathematical processing is more specific to the left inferior parietal cortex, especially the angular gyrus.

Deliberate control over the subject of attention will be the most profound revolution yet: it will change what we want out of life. But waiting for the technology will leave you hopelessly behind those who do it the old-fashioned way. Of course, most of us dressed-up apes don't care: there are bananas to be picked.


I am what I do: strictly functional design of cognitive process: www.cognitivealgorithm.info. Good concentration is relatively recent, but I’ve been working on a theory of intelligence for most of my life; anything else is trivial by comparison. I work on my own because nothing that I’ve come across is coherent enough. And also because I can: recently financially, and always emotionally.

The older I get (chronologically 55, biologically < 40), the more it hits me just how abnormal I am: an integrity nazi, insensitive to conditioning, driven mostly by value-free curiosity. Value-free doesn’t mean indiscriminate. Rather, such curiosity is selective for subjects of greater predictive value, which leads to an overriding interest in the process of prediction itself. Lacking the nearly universal addiction to social support and immediate experimental confirmation, I am free to follow intellectual imperatives.

My first interests were geography and history, then physical sciences and biology. I majored in social science because modern society has deeper structured complexity than any established subject. But that field lacks academic integrity. And the most important part of social progress is discovery and invention, which is basically a composite of individual human learning. So I switched to studying the latter, about a lifetime ago, both for intellectual depth and for potential impact on the world.

That doesn’t mean psychology and neuroscience. I got into both more recently (Cognitive Focus), mostly for insight into our deficiencies. To understand the intrinsic function of cognition, vs. a ton of other things in the human mind, sustained introspective generalization is far superior to mere observation. Having started with the former, I find almost everything about brains and neurons to be grossly sub-optimal. Which is not surprising for a product of blind evolution and severe biological constraints.

Formalizing cognition is also the only legitimate problem in philosophy, which was my interest for a while. But philosophers are too busy bullshitting college freshmen and other clueless highbrows; they don’t seem to have much time or motivation left for real work.

Math wasn’t my interest because it is primarily deductive. I start with induction to define intelligence, then derive operations from that definition. Which isn’t terribly controversial, but consistent derivation renders almost all math that I know irrelevant. People like math for its clarity and certainty, at least initially. But there is a direct tradeoff between certainty and complexity of the subject, and subjects don’t get more complex than human intelligence. I picked complexity and speculation first; certainty had to wait.

Computer science designs and implements algorithms, but one must formally define the purpose first. That wasn’t done for cognition, and it took me a lot of work before I could start coding. A cognitive algorithm must be designed with incremental complexity. And even a relatively simple core algorithm should learn increasingly complex computational short-cuts (math or CS) on its own, just like we do. There aren’t a lot of genes to encode our algorithm, and calculus certainly isn’t innate in humans.

Also missing here is a biography. Mine is a life of the mind; the rest is a distraction (I had plenty of that).
Throughout most of history, working alone on my problem would have been of no consequence. Things have changed: publish on the net now and Google will find you with the right keywords, status and credentials be damned. And convincing people is not even necessary anymore; all you really need is working code.

Still, a constructive conversation would be nice for now, seeing that I am still short of working code.
Anything I write is meant to be substantially original, thus speculative. But the subject is king.

I never stop questioning assumptions and all my posts are a work in progress.


Consciousness as an artifact of brain-to-body bottleneck

Conscious attention, implemented as working memory, is focused on one or a few items at a time. Intrinsically, cognition doesn't need such a central focus: our brains are massively parallel. Ideally, the pool of neurons would be allocated to many subjects of interest by something like a market, according to the predictive value of those subjects, as probably happens in unconscious or intuitive cognition.

But the brain evolved to guide a single body, which in most respects can only do one thing at a time. Hence the artifact of central consciousness. Besides disrupting smooth allocation of cognitive resources, this bottleneck obviously favors somatic concerns, which constitute the lower forms of human motivation.
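The contrast between market-like parallel allocation and single-body serial focus can be sketched in a few lines; the proportional rule and the function names are my own illustration, not a claim about the actual neural mechanism:

```python
def allocate_pool(pool_size, predictive_values):
    """Two ways to divide a pool of neurons among subjects of interest.
    Parallel (market-like): each subject gets a share proportional to its
    predictive value. Serial (conscious bottleneck): the single
    highest-valued subject takes the whole pool."""
    total = sum(predictive_values)
    parallel = [pool_size * v / total for v in predictive_values]
    best = max(predictive_values)
    winner_take_all = [pool_size if v == best else 0 for v in predictive_values]
    return parallel, winner_take_all

# With predictive values [5, 3, 2] over a pool of 100 neurons:
# parallel allocation is [50, 30, 20], while the serial focus is [100, 0, 0].
```

The lower-valued subjects lose everything under the serial scheme, which is the resource-allocation cost of the bottleneck described above.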

Such sequential focus is likely implemented by the thalamus, which serves as a central switchboard for the brain. It seems to invoke consciousness by generating higher-frequency brainwaves, especially gamma waves, which bind together areas related to working memory (brief overview: The Missing Moment by Robert Pollack, pp. 46-56; a far more involved treatment: Rhythms of the Brain by Gyorgy Buzsaki).

My personal opinion is that the main function of the thalamus is to mediate competition between brain areas, particularly via the TRN. From a networking perspective, it’s a lot cheaper to do this in a central body than to have each region or column directly inhibit all the others. In fact, Sherman and Guillery suggest that the thalamus could be viewed as a consolidated “7th layer” of neocortex (“Exploring the Thalamus”).
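The networking argument can be made concrete with a rough count of inhibitory links; treating regions as nodes in a graph is my own simplification:

```python
def inhibition_links(n_regions):
    """Compare the wiring cost of two mutual-inhibition schemes.
    Direct: every region inhibits every other region, n * (n - 1)
    directed links. Central hub (thalamus/TRN): each region needs only
    one link to the hub and one back, 2 * n links total."""
    direct = n_regions * (n_regions - 1)
    via_hub = 2 * n_regions
    return direct, via_hub

# For 100 competing regions: 9900 direct links vs. 200 through a hub.
# Direct inhibition grows quadratically, the hub only linearly.
```

This is the standard hub-vs-mesh argument from network design: the central body trades a quadratic wiring cost for a linear one.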

Primary sensory and motor cortices seem to be overrepresented in the thalamus: pulvinar nuclei alone comprise 40% of it. Better thalamic connectivity of primary cortices should enhance the search for relevant associations in other brain areas. This is introspectively plausible: most of working memory is what we currently visualize, vocalize, or actualize. I think we enhance our focus on general concepts in the same fashion: by generating fake experiences of subvocalizing, subvisualizing, and subactualizing.

This is probably mediated by feedback to primary cortices, which are underutilized during sensory “vacations”.
So, primary cortices are frequently “hijacked” by higher areas to simulate (interactively project) their generalized concepts. Such “primarization” is particularly important for mathematicians, engineers, and scientists, who work on imaginary constructs and often think visually rather than verbally.

However, primary cortices are unnaturally “low” for such subjects. Because this “elevation” is wrong, these projections often become false memories, confabulations, hallucinations: a substitution of imagination (feedback) for actual experience (feedforward). This may be a factor in developing schizophrenia, in which imagination seems to get out of control. It is suggestive that the default mode network, and specifically the left posterior cingulate cortex, was found to be unusually active in schizophrenics.

Such confusion should be more likely in habitually hijacked primary areas, which may become less attached to their respective senses. More general concepts are represented by higher association cortices. The highest area seems to be dorsolateral PFC: developmentally the last to myelinate and the most involved in executive function. Primarization could be mediated by short-cuts to lower levels of cortical hierarchy, such as arcuate fasciculus and spindle neurons, with their far-reaching axons.

Basal ganglia: subcortical modulator of attention.

While the thalamus seems to be a relatively neutral mediator of competition for conscious attention, the basal ganglia implement conditioning, which actively directs focus. Phasic dopamine in the basal ganglia also signals “reward prediction error”, and variation in sensitivity to dopamine is a risk factor for ADHD.

For example, ADHD is correlated with the 7-repeat allele of DRD4, which accelerates reuptake. Even more important might be variation in the COMT gene: the Met158 allele, which degrades postsynaptic dopamine 4x slower than the Val158 allele, is associated with better working memory but slower attention switching. Basically, it enhances top-down or goal-directed attention vs. bottom-up or novelty-oriented attention. ADHD is treated with norepinephrine and dopamine agonists or reuptake inhibitors, such as Bupropion.

This differs between hemispheres: "To advance our understanding of ADHD and medication effects we draw upon the evidence for (1) a neurotransmitter imbalance between norepinephrine and dopamine in attention-deficit hyperactivity disorder and (2) an asymmetric neural control system that links the dopaminergic pathways to left hemispheric processing and links the noradrenergic pathways to right hemispheric processing. It appears that attention-deficit hyperactivity disorder may involve a bi-hemispheric dysfunction characterized by reduced dopaminergic and excessive noradrenergic functioning. In turn, favorable medication effects may be mediated by restoration in neurotransmitter balance and by increased control over the allocation of attentional resources between hemispheres".

On a cellular level, temporal attention span is inversely proportional to the “decay rate” of stimuli propagating from primary into association areas of neocortex. Passive decay is caused by charge dissipation across the neuronal membrane and reuptake of excitatory neurotransmitters at the synapses. Such decay promotes relatively novel stimuli. On the other hand, active suppression by neurons that represent competing stimuli, via inhibitory interneurons and neurotransmitters, should promote relatively recurrent or concurrent stimuli. Longer term, slower passive decay would correspond to longer connections and competition among more distant and persistent stimuli.
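The inverse relation between decay rate and temporal span can be sketched as simple exponential decay; the time constants and threshold here are illustrative, not physiological measurements:

```python
import math

def time_above_threshold(initial, decay_rate, threshold):
    """Duration for which an exponentially decaying trace
    a(t) = initial * exp(-decay_rate * t) stays above a detection
    threshold: t = ln(initial / threshold) / decay_rate.
    Span is inversely proportional to decay rate, so halving the
    decay rate doubles the time the stimulus remains available."""
    return math.log(initial / threshold) / decay_rate

fast = time_above_threshold(1.0, 2.0, 0.1)  # faster decay: shorter span
slow = time_above_threshold(1.0, 1.0, 0.1)  # half the decay rate: twice the span
```

The hyperbolic form `span = ln(initial / threshold) / decay_rate` is just the point where the decaying trace crosses the threshold, which matches the “inversely proportional” claim above.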

Other factors affecting the stimuli decay rate are axonal straightness and myelination, structural trade-offs within cortical minicolumns and the thalamus (see the “Cortical Trade-Offs“ post), and so on. A developmental possibility is that high levels of cortisol / low levels of serotonin increase the levels of phasic dopamine, which in turn accelerates dopamine reuptake. ADHD sufferers have fewer dopamine autoreceptors, leading to greater fluctuations in dopamine levels and increased novelty seeking to keep the cortex busy.

The degree of preference for novelty in the immediate environment also depends on the recent intensity of value-loaded stimuli, modulated by our subjective sensitivity to the latter. Sensitivity is increased by deprivation (vs. addiction) for positive stimuli, and by security (vs. vulnerability) for negative ones. Particularly during formative years, attention span can be increased by broad intellectual exposure, if combined with weak visceral pressures and temptations.