Some hints from my approach to the problem

I have to admit that, to date, my suggestion of a kind of 'complementarity principle' among theories (or between two main theoretical categories) is little more than wishful thinking. Proponents of general or music-specific taxonomic theories, in fact, seem convinced that theories based on conventions (on the one side) and theories based on prototypes, family resemblances and schemata (on the other side) actually oppose and exclude one another. There seems to be something in common between such diverse theories as George Lakoff's (and other cognitive scientists') theory of prototypes and 'basic level recognition' (Lakoff 1987), Pierre Bourdieu's habitus ('principles which generate and organize practices and representations that can be objectively adapted to their outcomes without presupposing conscious aiming at ends or an express mastery of the operations necessary in order to attain them', Bourdieu 1992, 53), and Daniel Levitin's neo-Kantian 'schemas' (or schemata, Levitin 2008): they all strongly oppose rules and property-based definitions (to the point, in Levitin's case, of ridiculing them) [1], and favour explanations based ultimately on neural phenomena, on the formation of cognitive and behavioural habits hardwired in human bodies. In terms that those authors would probably not accept (or have accepted), this means bringing such phenomena well below the domain of semiotics. At least in Bourdieu's case, the dismissal of rules was part of an anti-structuralist stance (aimed especially at French structuralist anthropology, i.e. Lévi-Strauss) [2]; in Levitin's case, with his grotesque description of property-based definitions, I would claim that there is a misunderstanding of semiotics' basic principles (such as coding). In Lakoff's case, moreover, a thorough commentary, together with a revision of semiotic theories, has been provided by Umberto Eco in his Kant and the Platypus (1999).

In all cases, even in Eco's theory of 'cognitive types', a kind of black box is invoked: all we seem to be allowed to know is that there is something functioning at the interface between perception and cognition, and that it influences our behaviour. It is suggested that our neural system, with specific associations of neurons 'firing', forms the physical basis of such a black box. It is also implied that we 'learn' how to recognize and behave (Bourdieu is the one who worked most on this issue, though he admitted that it was a very complex matter), but it is unclear what learning processes are involved, and whether cognition plays any part in them.

Most of these explanations (whose authors have been particularly keen to reproach other theories for the same fault) tend to be 'static' in one way or another. It is clear that we are not born with a schema to recognize heavy metal (Levitin's example), and that at some point we can form one, based on family resemblances or prototypes. But how does it happen? Is it possible to improve our schemata so that we recognize heavy metal more promptly? Is it possible to forget them? And how was heavy metal first recognized (as there must have been someone who had this experience before anyone else)? Moreover, does the expression 'heavy metal' exist only to allow a listener to recognize specific pieces of music? What about guitarists who want to learn to play like one of the prototypical heavy metal guitarists: must they develop specific schemata? Are these similar to or different from those of an average listener? What about the manager of a stadium where a heavy metal gig is planned: will their considerations about seat placement, security, and so on, be part of a distinctive schema? Or of a habitus? Or are such considerations in the domain of cognition, of 'conscious aiming at ends'?

How we recognize things is very important, as is how we make sense of such recognition. Unfortunately, though this makes them all the more interesting, concepts (cultural units) such as genre extend over both domains. Humans use genre – genre names, especially – to talk about music. For many, this is one way, or maybe the only way, to verbalize their own experience of music. Some people are definitely able to talk about a genre without even being able to recognize a piece of music that 'belongs' to it. Genres are also about beliefs and lies: as such, they can be an object of semiotic study. But can semiotics make sense of the diachronic aspects of genre? Isn't a semiotic approach to genre (as a cultural unit, or 'a semantic unit inserted into a system', see Eco 1976, 67) 'static', as a number of scholars maintain (see, for example, Santoro 2010, 27-28)?

Often what emerges from such criticism is a distorted image of semiotic concepts: not distorted enough to make the destructive rhetoric clear even to those who are not familiar with the discipline, yet distorted just enough to accommodate common-sense prejudices against a field of study that hasn't been fashionable in recent decades. Semiotic codes – according to that rhetoric – are like commandments: they are agreed upon deliberately by fixed communities and are there for good. Codes (or norms, or conventions) are binding and not negotiable; no conflicts about their meaning are possible. Such criticisms imply that semioticians (and their associates: linguists and language philosophers) don't know how to handle collective processes: only sociologists know what a socially shared norm is and how it works. I would argue that this is not so. Yet I do not imply by this that sociological approaches to genre are wrong.

Another advantage of semio-linguistic approaches to genre, one that opens new, interesting perspectives on the otherwise 'mysterious' act of initial codification, emerges from the philosophical study of convention published by David K. Lewis (1941–2001) in 1969. Aimed at a rigorous definition of one of the fundamental processes that make language possible, the study is based on a class of games (coordination games) that were overlooked by game theorists at the time. With examples of growing complexity, Lewis shows how conventions can be established without ever being stipulated explicitly. Therefore, the conventional nature of language, and of any code (or set of norms) established conventionally, does not imply that at any point there must be an explicit agreement, or that the parties involved (or the community) must declare their acceptance of the convention(s). Of course, the existence of a convention can be recognized and even officially acknowledged, but recognition happens when the convention is already in place, and is based on the actual functioning of the convention itself. There is no black box, though. Here is the first rough definition of convention given by Lewis:

A regularity R in the behaviour of members of a population P when they are agents in a recurrent situation S is a convention if and only if, in any instance of S among members of P,
(1) everyone conforms to R;
(2) everyone expects everyone else to conform to R;
(3) everyone prefers to conform to R on the condition that the others do, since S is a coordination problem and uniformity to R is a proper coordination equilibrium in S (Lewis 2002, 42).

Given the other, more refined definitions in the book, Lewis himself warns against the risk of hiding 'the concept beneath its refinements' (Ibid.). The rough definition is enough for our purpose [3], especially if we complement it with an extract from the book's 'Foreword', by Willard Van Orman Quine:

We have before us a study, both lucid and imaginative, both amusing and meticulous, in which Lewis undertakes to render the notion of convention independent of any fact or fiction of convening. […] in the course of the book the reader comes to appreciate convention, not analyticity, as a key concept in the philosophy of language. [4]

The final comment, made by the quintessential analytic philosopher Willard Van Orman Quine about one of his pupils, is meaningful. But are we allowed to extrapolate Lewis's ideas from the philosophy of language to genre theory? Yes, without any doubt, I would say. In fact, Lewis does the opposite: he attaches to the philosophy of language a theory based on the observation of ordinary recurrent situations, such as: who should call back first if a telephone call is unexpectedly cut off? Such coordination problems are essential in music practice. Although Lewis's examples do not cover musical activities, one of the earliest examples in the book could easily be translated into musical terms:

An example from Hume's Treatise of Human Nature: Suppose you and I are rowing a boat together. If we row in rhythm, the boat goes smoothly forward; otherwise the boat goes slowly and erratically, we waste effort, and we risk hitting things. We are always choosing whether to row faster or slower; it matters little to either of us at what rate we row, provided we row in rhythm. So each is constantly adjusting his rate to match the rate he expects the other to maintain (Lewis 2002, 5).

In a recurring music event, maximizing the pleasure of each of the participants, or ensuring everyone has the best understanding of what's going on, or minimizing the amount of information that must be processed to obtain pleasure and understanding may be perceived as coordination problems; conforming to genre conventions, to use Lewis's words, is 'a proper coordination equilibrium' in that recurring event.
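Lewis's notion of a 'proper coordination equilibrium' can be made concrete with a toy version of the rowing example. What follows is only a sketch under stated assumptions: the two stroke rates, the payoffs and the function names are hypothetical and do not come from Lewis's text, and the test encodes one reading of his definitions (after any unilateral deviation nobody is better off, and the deviating agent is strictly worse off).

```python
from itertools import product

# Hypothetical setting: two rowers each choose a stroke rate.
RATES = ("slow", "fast")


def payoff(a, b):
    """Assumed payoffs: both rowers get 1 if they row in rhythm, 0 otherwise."""
    return (1, 1) if a == b else (0, 0)


def is_proper_coordination_equilibrium(profile, actions, payoff_fn):
    """Check one reading of Lewis's 'proper coordination equilibrium'.

    For every unilateral deviation by a single agent:
    - no agent ends up better off (coordination equilibrium), and
    - the deviating agent ends up strictly worse off (proper).
    """
    base = payoff_fn(*profile)
    for i in range(len(profile)):
        for alt in actions:
            if alt == profile[i]:
                continue
            deviated = list(profile)
            deviated[i] = alt
            new = payoff_fn(*deviated)
            if any(new[j] > base[j] for j in range(len(profile))):
                return False  # someone would gain from the deviation
            if new[i] >= base[i]:
                return False  # the deviator is not strictly worse off
    return True


# Enumerate every combination of choices and keep the qualifying ones.
equilibria = [
    p for p in product(RATES, repeat=2)
    if is_proper_coordination_equilibrium(p, RATES, payoff)
]
print(equilibria)
```

Under these assumed payoffs both matching pairs qualify, and that is the point: nothing in the situation itself singles out one of them, so whichever regularity a population settles on is arbitrary yet self-sustaining.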

What is especially fascinating in Lewis's theory is that it makes sense of the 'subconscious' aspects of categorization and behaviour (see Bourdieu's 'practices and representations that can be objectively adapted to their outcomes without presupposing conscious aiming at ends'), but without making use of black boxes. Or, rather, by looking into them.

In order to make a comparison, let us return to some other theories. According to prototype-based theories, genres are defined by resemblance; in my opinion, it may well be that crucial resemblances are established by convention. Moreover, I would argue that although prototypes and family resemblances are fundamental in the recognition of stylistic features by individuals, what makes them meaningful in collective music practice is that they are conventionally formalized; that is, that a community socializes such recognition and takes full advantage of it. Of course, an individual can learn a schema to distinguish 'heavy metal' as the genre where music resembles pieces by Led Zeppelin, the 'quintessential heavy metal band' (Levitin 2008, 142) [5].

[Figure caption: A 'quintessential heavy metal singer'?]

[Figure caption: A 'quintessential heavy metal guitarist'?]

But which piece(s)? And who developed that schema first? And how is it transmitted to a whole community? And is that resemblance enough to define all aspects of the genre? Levitin himself acknowledges that prototypes and family resemblances are not satisfactory explanations for the many nuances of music competence, though he doesn't seem worried by the circularity of arguments such as 'we say that something is heavy metal if it resembles heavy metal'. Looking for diachronicity on the basis of a theory that is exemplified by such statements can be a hopeless task.

On the other hand, convention (as described by Lewis) is a process situated in time: there is a time before a convention is established, and a time after; it is also possible to describe the process by which a convention ceases to exist (when, for example, it is recognized that a certain regularity in the behaviour of the population is no longer a coordination equilibrium, and members of the population stop conforming to it). If we are looking for a theory that helps to explain how genres are born, we also need a theory that accounts for 'dead' genres.