CorporisPublica:Categorization

This page contains guidance on the proper use of the categorization function in CorporisPublica. For information on the mechanics of the function, category syntax, etc., see Help:Category. For quick answers, see the Categorization FAQ. For the category system itself, see Category:Contents. For proposals to delete, merge, or rename categories, follow the instructions at Categories for discussion. Please use that process before undertaking any complicated re-categorization of existing categories or mass creation of new categories.

Overview
The central goal of the category system is to provide navigational links to all pages within a hierarchy of categories. Readers who know the essential, defining characteristics of a topic can browse this hierarchy and quickly find sets of pages on topics that are defined by those characteristics.

Categories are not the only means of enabling users to browse sets of related pages. Other tools which may be used instead of or alongside categories in particular instances include lists and navigation boxes. For a comparison of these techniques, see Categories, lists and navigation templates.

Creating category pages
To create a category, first add an article to the category. It will appear as a red link in the category list at the bottom of the page. Click on the red link, and add the new category to an appropriate parent category.

Sometimes, a common-sense guess based on the title of the category isn't enough to figure out whether a page should be listed in the category. So, rather than leave the text of a category page empty (containing only parent category declarations), it is helpful – to both readers and editors – to include a description of the category, indicating what pages it contains, how they should be subcategorized, and so on.

In such cases, the desired contents of the category should be described on the category page, similar to how the list selection criteria are described in a stand-alone list. The category description should make direct statements about the criteria by which pages should be selected for inclusion in the category. This description, not the category's name, defines the proper content of the category. Do not leave future editors to guess about what or who should be included from the title of the category. Even if the selection criteria might seem obvious to you, an explicit standard is helpful to others, especially if they are less familiar with the subject.

The description can also contain links to other pages, in particular to other related categories which do not appear directly as subcategories or parent categories, and to "sister categories" on other projects, such as Commons. Another technique that can be used is described at CorporisPublica:Classification.

Various templates have been developed to make it easier to produce category descriptions; see Category namespace templates. There are hatnote templates including Cat main and Category see also; others are listed at Template:Hatnote templates documentation.

A maximum of 200 category entries are displayed per screen. To make navigating large categories easier, a table of contents can be used on the category page. The following templates are some of the ways of doing this:
 * Category TOC – adds a complete table of contents (Top, 0–9, A–Z)
 * Large category TOC 2 – adds a complete table of contents with five subdivisions for each letter (Aa Ae Aj Ao At)
 * Large category TOC – adds a complete table of contents with twenty-six subdivisions for each letter (Aa ... Az)

Subcategories are split alphabetically along with the articles, which means that the initial screen of a split category may not include all its subcategories. To make all subcategories display on each screen, add a category tree to the text of the category page, as described at the help page under Displaying category trees and page counts.

Interlanguage links work on category pages just as they do for articles, and can be used to link to corresponding categories on other language Wikipedias.

Categorizing pages


Every page should belong to at least one category. (However, there is no need to categorize talk pages, redirects, or user pages, though these may be placed in categories where appropriate.) In addition, each categorized page should be placed in all of the most specific categories to which it logically belongs. This means that if a page belongs to a subcategory of C (or a subcategory of a subcategory of C, and so on) then it is not normally placed directly into C. For exceptions to this rule, see Non-diffusing subcategories below.

While it should typically be clear from the name of an existing category which pages it should contain, the text of the category page may sometimes provide additional information on potential category contents.

One way to determine if suitable categories already exist for a particular page is to check the categories of pages concerning similar or related topics. Another way is to search existing category names as described here (top of page).

Note: If you're not sure where to categorize a particular page, add the uncategorized template to it; other editors (such as those monitoring CorporisPublica:WikiProject Categories/uncategorized) will help find appropriate categories for it.

Articles
Categorization of articles must be verifiable. It should be clear from verifiable information in the article why it was placed in each of its categories. Use the Category unsourced template if you find an article in a category that is not shown by sources to be appropriate, or the Category relevant? template if the article gives no clear indication for inclusion in a category.

Categorization must also maintain a neutral point of view. Categorizations appear on article pages without annotations or referencing to justify or explain their addition; editors should be conscious of the need to maintain a neutral point of view when creating categories or adding them to articles. Categorizations should generally be uncontroversial; if the category's topic is likely to spark controversy, then a list article (which can be annotated and referenced) is probably more appropriate.

A central concept used in categorizing articles is that of the defining characteristics of the subject of the article. A defining characteristic is one that reliable sources commonly and consistently define the subject as having, such as nationality or notable profession (in the case of people), type of location or region (in the case of places), etc. For example, in "Caravaggio, an Italian artist of the Baroque movement ...", Italian, artist, and Baroque may all be considered to be defining characteristics of the subject Caravaggio. A category embodies one or more defining characteristics; how this is achieved in practice is described in the following sections.

Particular considerations for categorizing articles:
 * By convention, category declarations are placed at the end of the wikitext, but before any stub templates (which themselves transclude categories) and interlanguage links.
 * The order in which categories are placed on a page is not governed by any single rule (for example, it does not need to be alphabetical, although partially alphabetical ordering can sometimes be helpful). Normally the most essential, significant categories appear first.
 * An article should never be left with a non-existent (redlinked) category on it. Either the category should be created, or else the link should be removed or changed to a category that does exist.
 * Categorization should not be made by the type of an article. A biographical article about a specific person, for example, does not belong in Category:Biography.
 * Articles on fictional subjects should not be categorized in a manner that confuses them with real subjects.

Files/images
Category tags can be added to file/image pages of files that have been uploaded to CorporisPublica. When categorized, files are not included in the count of articles in the category, but are displayed in a separate section with a thumbnail and the name for each. A category can mix articles and images, or a separate file/image category can be created. A file category is typically a subcategory of the general category about the same subject, and a subcategory of the wider category for files, Category:CorporisPublica files. To categorize a new file when uploading, simply add the category tag to the upload summary.

Freely licenced files may also be uploaded to, and categorized on, Wikimedia Commons. This can be done instead of, or in addition to, uploading and categorizing on CorporisPublica. Most freely licenced files will eventually be copied or moved from CorporisPublica to Commons, with a mirror page remaining on CorporisPublica. (For an example of one such mirror page, see here.) Categories should not be added to these mirror pages, because doing so creates a new CorporisPublica page that is subject to speedy deletion. Exceptions to this principle are made for mirror pages of images that are nominated as featured pictures and for those that appear on the Main Page in the Did You Know? column.

Non-free or fair-use images that are used in CorporisPublica should not appear as thumbnail images in categories. To prevent the thumbnail preview of images from appearing in a category, __NOGALLERY__ should be added to the text of the category. In such cases, the file will still appear in the category, but the actual image preview will not.

CorporisPublica administrative categories
A distinction is made between two types of category:
 * Administrative categories, intended for use by editors or by automated tools, based on features of the current state of articles, or used to categorize non-article pages.
 * Content categories, intended as part of the encyclopedia, to help readers find articles, based on features of the subjects of those articles.

Administrative categories include stub categories (generally produced by stub templates), maintenance categories (often produced by tag templates such as cleanup and fact, and used for maintenance projects), WikiProject and assessment categories, and categories of pages in non-article namespaces.

Article pages should be kept out of administrative categories if possible. For example, the templates that generate WikiProject and assessment categories should be placed on talk pages, not on the articles themselves. If it is unavoidable that an administrative category appears on article pages (usually because it is generated by a maintenance tag that is placed on articles), then in most cases it should be made a hidden category, as described under Hiding categories below.

There are separate administrative categories for different kinds of non-article pages, such as template categories, disambiguation page categories, project page categories etc.

User pages
User pages are not articles, and thus do not belong in content categories such as Living people or Biologists. They can however be placed in user categories – subcategories of Category:Wikipedians, such as Category:Wikipedian biologists – which assist collaboration between users.

Similarly, user subpages that are draft versions of articles should be kept out of content categories, but are permitted in non-content or project categories, like Category:User essays. If you copy an article from mainspace to userspace and it already contains categories, remove them or comment them out. Restore the categories when you move the draft back into article space.
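The "comment them out" step can be sketched in a few lines of Python. This is a minimal illustration, not an existing tool: the function names are hypothetical, and it assumes category declarations use the plain `[[Category:Name]]` or `[[Category:Name|sort key]]` form.

```python
import re

# Matches [[Category:Name]] and [[Category:Name|sort key]] declarations.
# Links written as [[:Category:Name]] (leading colon) are not declarations
# and are deliberately left alone.
CATEGORY_DECL = r"\[\[\s*Category\s*:[^\[\]]+\]\]"

def comment_out_categories(wikitext: str) -> str:
    """Wrap each category declaration in an HTML comment (for userspace drafts)."""
    return re.sub(CATEGORY_DECL, lambda m: f"<!-- {m.group(0)} -->",
                  wikitext, flags=re.IGNORECASE)

def restore_categories(wikitext: str) -> str:
    """Undo comment_out_categories when the draft returns to article space."""
    return re.sub(rf"<!--\s*({CATEGORY_DECL})\s*-->", lambda m: m.group(1),
                  wikitext, flags=re.IGNORECASE)

draft = "Some text.\n[[Category:Rivers of Albania]]\n[[Category:2004|Film]]"
hidden = comment_out_categories(draft)
```

Running `comment_out_categories` twice on the same text would nest the comments, so a real workflow would apply it once when the copy is made.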

A list of affected categories is maintained at Database reports/Polluted categories.

Categorization using templates
Many templates include category declarations in their transcludable text, for the purpose of placing the pages containing those templates into specific categories. This technique is very commonly used for populating certain kinds of administrative categories, including stub categories and maintenance categories.

However, it is recommended that articles should not be placed in ordinary content categories using templates in this way. There are many reasons for this: editors cannot see the category in the wikitext; removing or restructuring the category is made more difficult (partly because automated processes will not work); inappropriate articles and non-article pages may get added to the category; sort keys may be unavailable to be customised per category; ordering of categories on the page is less controllable; and the "incategory" search term will not find such pages.

When templates are used to populate administrative categories, ensure that the code cannot generate nonsensical or non-existent categories, particularly when the category name depends on a parameter. Also, see Category suppression for ways of keeping inappropriate pages out of template-generated categories.
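The idea of guarding a parameter-dependent category name can be illustrated outside template syntax. The following Python sketch is only an analogy for the check a template would need to perform; the set of existing categories, the fallback name, and the function name are all hypothetical.

```python
from typing import Optional

# Hypothetical set of category names known to exist.
EXISTING_CATEGORIES = {"Stub articles", "Geography stubs", "History stubs"}

# Hypothetical fallback used whenever the parameter does not map to a real category.
FALLBACK_CATEGORY = "Stub articles"

def stub_category(topic: Optional[str]) -> str:
    """Build a category name from a parameter, never returning a non-existent one."""
    if topic:
        candidate = f"{topic.strip().capitalize()} stubs"
        if candidate in EXISTING_CATEGORIES:
            return candidate
    return FALLBACK_CATEGORY
```

A missing, empty, or unrecognized parameter falls back to the general category instead of producing a redlinked one.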

Category declarations in templates often use the PAGENAME magic word as the sort key, particularly if they are designed to be placed on talk pages, as this suppresses the Talk: prefix. Note that this overrides any DEFAULTSORT defined on the page.

Hiding categories
In cases where, for technical reasons, administrative categories appear directly on articles rather than talk pages, they should be made into hidden categories, so that they are not displayed to readers. This rule does not apply to stub categories or "uncategorized article" categories; these types are not hidden.

To hide a category, add the Hidden category template to the category page (the template uses the magic word __HIDDENCAT__). This also places the page in Category:Hidden categories.

A logged-in user may elect to view all hidden categories, by checking "Show hidden categories" on the "Appearance" tab of My Preferences. Notice that "hidden" parent categories are never in fact hidden on category pages (although they are listed separately).

Redirected categories
Because of software limitations, ordinary (hard) redirects should not be used with category pages. If a category name needs to be redirected to another, use the Category redirect template to create a soft redirect. A bot traverses categories redirected in this manner, moving articles out of the redirected category into the target category. (See Template talk:Category redirect.)

Sort keys
Sort keys are sometimes needed to produce a correct ordering of member pages and subcategories on the category page. For the mechanics, see Sort order on the help page.

Because the software uses an imperfect computer sorting rather than true alphabetical ordering (see details), it is important that some sort keys be adjusted consistently. Until recently, the biggest needed adjustment was to consistently capitalize the first letter of each word and make lower case all other letters. However, the software has been improved and the largest remaining adjustment required is the replacement of non-English accented characters, such as "ź" with English counterparts, e.g. "z".

Categories of people are usually sorted by last name rather than first name, so "surname, forename" sort keys are used (as in "Washington, George"). For more information, see Ordering names in a category in the people categorization guideline.

Entries containing modified letters should be sorted as if the letters were unmodified (for example, Łódź has the sort key "Lodz").
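As a rough illustration of the two rules above, the following Python sketch builds a "surname, forename" key and strips modified letters. The function names are hypothetical, and the fallback table is illustrative rather than exhaustive: letters such as "Ł" have no Unicode decomposition, so they need an explicit mapping.

```python
import unicodedata

# Letters with no Unicode decomposition need an explicit mapping.
# This table is illustrative, not exhaustive.
FALLBACK = str.maketrans({
    "Ł": "L", "ł": "l",
    "Ø": "O", "ø": "o",
    "Đ": "D", "đ": "d",
    "ß": "ss",
})

def strip_modifiers(title: str) -> str:
    """Return the title with accents and other letter modifications removed."""
    decomposed = unicodedata.normalize("NFKD", title.translate(FALLBACK))
    # Drop the combining marks that NFKD separates from their base letters.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

def person_sort_key(forename: str, surname: str) -> str:
    """Build a 'surname, forename' sort key with unmodified letters."""
    return strip_modifiers(f"{surname}, {forename}")
```

With these definitions, `strip_modifiers("Łódź")` gives "Lodz" and `person_sort_key("George", "Washington")` gives "Washington, George". Names that do not follow the forename-surname pattern still need editorial judgment.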

Other sort key considerations:
 * Leading articles (a, an, and the) are one of the most common reasons for using sort keys: the article is moved to the end of the key, so that the page sorts under the word that follows it. Please also apply these sort keys to deliberate misspellings of these words (e.g. "da" or "tha" for "the"), as well as to foreign-language articles such as "el" or "der" (but beware of non-articles that have the same spelling, e.g. words that translate as "at" or "one"). However, leading articles in foreign-language-derived names which are no longer translated in English are not subject to this rule; e.g. the sort key for El Paso should be "El Paso".
 * Spell out abbreviations and characters used in place of words so that they can be found easily in categories. E.g. the sort key for Mr. Bean should be "Mister Bean", and Dungeons & Dragons should be sorted as "Dungeons and Dragons".
 * Entries containing numbers sometimes need special sort keys to ensure numerical rather than alphabetical ordering (for example, 19 and 103 come before 2 in alphabetical order, and IX comes before V). So Haydn's 13th symphony might have the sort key "Symphony 013", the zero ensuring that it is listed before symphonies 100–108; Pope John IX might have a sort key "John 09". It is important to stick to the same system for all similar entries in a given category.
 * Systematic sort keys are also used in other categories where the logical sort order is not alphabetical (for example, individual month articles in year categories such as Category:2004 use sort keys like "*2004-04" for April). Again, such systems must be used consistently within a category.
 * In some categories, sort keys are used to exclude prefixes that are common to all or many of the entries, or are considered unimportant (such as "List of" or "The"). For example, in Category:2004 the page 2004 in film would have the sort key "Film", and in Category:2004 in Canada the page 2004 Canadian federal budget would have the sort key "Federal Budget".
 * Use a space as the sort key for a key article for the category. (Note: If the key article should not be a member, simply edit the category text itself to add it, perhaps using Cat main.)
 * Use other sort keys beginning with a space (or an asterisk or a plus sign) for any "List of ..." and other pages that should appear after the key article and before the main alphabetical listings. The same technique is sometimes used to bring particular subcategories to the start of the list.
 * To place entries after the main alphabetical list, use sort keys beginning with tilde ("~"). Several Greek letters are also used for specific purposes. "Σ" (sigma) is used to place stub categories at the end of subcategory lists ("µ" was previously used but the capital version "Μ" was confusing). "β" (beta, displays as "Β") is for CorporisPublica books. "ι" (iota, displays as "Ι") is for images. "ρ" (rho, displays as "Ρ") is for portals. "τ" (tau, displays as "Τ") is for templates. "ω" (omega, displays as "Ω") is for WikiProjects. Similar to the handling of Latin letters, if the sort key is a lower case Greek letter then the capital Greek letter will be displayed in headings on category pages. "β" will appear beneath "Β"; "ι" beneath "Ι"; "ρ" beneath "Ρ"; "τ" beneath "Τ"; "ω" beneath "Ω"; etc. Several of these resemble Latin letters B, I, P etc., but will sort after Z.
 * If a page is to be given the same sort key in all or several of its categories, the DEFAULTSORT magic word can be used. Per CP:FOOTERS, this is placed just before the list of category declarations. Default sort keys are sometimes defined even where they do not seem necessary (when they are the same as the page name, for example) in order to prevent other editors or automated tools from trying to infer a different default.
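Several of the considerations above are mechanical enough to sketch in Python. The helper names and the example titles are hypothetical, only English articles are handled, and plain code-point ordering is used as an approximation of how the software orders keys.

```python
import re

# English articles only; foreign-language articles and names such as
# "El Paso" need editorial judgment and are left untouched here.
ARTICLES = ("a", "an", "the")

def move_leading_article(title: str) -> str:
    """Move a leading article to the end of the key."""
    first, _, rest = title.partition(" ")
    if rest and first.lower() in ARTICLES:
        return f"{rest}, {first}"
    return title

def pad_numbers(key: str, width: int = 3) -> str:
    """Zero-pad every number so that text ordering matches numeric ordering."""
    return re.sub(r"\d+", lambda m: m.group(0).zfill(width), key)

# A space sorts first, then "*", then letters, then "~".
ordered = sorted(["~Appendix", "Paris", " ", "*List", "Lyon"])

symphonies = sorted(pad_numbers(k) for k in ["Symphony 104", "Symphony 13", "Symphony 2"])
```

Here `move_leading_article("The Lighthouse")` gives "Lighthouse, The", and `pad_numbers("Symphony 13")` gives "Symphony 013". The padding width must be chosen once per category and then used for every similar entry.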

Category tree organization


Categories are organized as overlapping “trees”, formed by creating links between inter-related categories. Any category may contain (or “branch into”) subcategories, and it is possible for a category to be a subcategory of more than one “parent” category. (A is said to be a parent category of B when B is a subcategory of A.)

There is one top-level category, Category:Contents. All other categories are found below this. Hence every category apart from this top one must be a subcategory of at least one other category.
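The requirement that every category sits somewhere below the top-level category can be checked with a simple graph traversal. The sketch below is an assumption-laden illustration: the category graph is a hypothetical in-memory mapping from each category to its subcategories, and the function name is invented.

```python
from collections import deque

def unreachable_categories(children: dict[str, list[str]], root: str = "Contents") -> set[str]:
    """Return every category that cannot be reached by descending from the root."""
    seen = {root}
    queue = deque([root])
    while queue:
        current = queue.popleft()
        for child in children.get(current, []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    # Every name that appears anywhere in the graph, as parent or as child.
    all_names = set(children) | {c for subs in children.values() for c in subs}
    return all_names - seen

graph = {
    "Contents": ["Geography"],
    "Geography": ["France"],
    "Orphan": ["Lost"],  # hypothetical categories with no path from the root
}
```

In this example both "Orphan" and "Lost" are reported: "Lost" has a parent, but that parent is itself detached from the tree.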

There are two main kinds of category:
 * Topic categories are named after a topic (usually sharing a name with the article on that topic). For example, Category:France contains articles relating to the topic France.
 * Set categories are named after a class (usually in the plural). For example, Category:Cities in France contains articles whose subjects are cities in France.

Sometimes, for convenience, the two types can be combined, to create a set-and-topic category (such as Category:Voivodeships of Poland, which contains articles about particular voivodeships as well as articles relating to voivodeships in general).

Subcategorization
If logical membership of one category implies logical membership of a second, then the first category should be made a subcategory (directly or indirectly) of the second. For example, Cities in France is a subcategory of Populated places in France, which in turn is a subcategory of Geography of France.

Many subcategories have two or more parent categories. For example, Category:British writers should be in both Category:Writers by nationality and Category:British people by occupation. When making one category a subcategory of another, ensure that the members of the first really can be expected (with possibly a few exceptions) to belong to the second also. Category chains formed by parent-child relationships should never form closed loops. If two categories are closely related but are not in a subset relation, then links between them can be included in the text of the category pages.
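A closed loop in the parent-child chains can be found with a depth-first search. This is a minimal sketch over a hypothetical in-memory mapping from each category to its subcategories; the function name is invented.

```python
from typing import Optional

def find_loop(children: dict[str, list[str]]) -> Optional[list[str]]:
    """Return one closed loop of categories, or None if there is none."""
    finished: set[str] = set()   # categories whose subtrees are fully explored
    path: list[str] = []         # categories on the current descent

    def visit(node: str) -> Optional[list[str]]:
        path.append(node)
        for child in children.get(node, []):
            if child in path:
                # The child is one of its own ancestors: a closed loop.
                return path[path.index(child):] + [child]
            if child not in finished:
                found = visit(child)
                if found:
                    return found
        path.pop()
        finished.add(node)
        return None

    for start in list(children):
        if start not in finished:
            loop = visit(start)
            if loop:
                return loop
    return None

tree = {
    "Geography of France": ["Populated places in France"],
    "Populated places in France": ["Cities in France"],
}
# Deliberately broken variant in which the chain closes on itself.
looped = {**tree, "Cities in France": ["Geography of France"]}
```

`find_loop(tree)` returns None, while `find_loop(looped)` returns the chain that starts and ends at "Geography of France". Recursion depth limits would matter for very deep trees.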

A page or category should rarely be placed in both a category and a subcategory or parent category (supercategory) of that category (however, see directly below). For example, the article "Paris" need only be placed in "Category:Cities in France", not in both "Category:Cities in France" and "Category:Populated places in France". Since the first category is in the second category, readers are already given the information that Paris is a populated place in France by it being a city in France.
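The rule can be expressed as pruning any category that is an ancestor of another category already on the page. The following sketch assumes a hypothetical mapping from each category to its parent categories; the function names are invented, and the exception described directly below is ignored here.

```python
def ancestors(category: str, parents: dict[str, list[str]]) -> set[str]:
    """Collect every direct and indirect parent of a category."""
    found: set[str] = set()
    stack = list(parents.get(category, []))
    while stack:
        current = stack.pop()
        if current not in found:
            found.add(current)
            stack.extend(parents.get(current, []))
    return found

def most_specific(categories: list[str], parents: dict[str, list[str]]) -> list[str]:
    """Drop categories that are already implied by a more specific one."""
    redundant: set[str] = set()
    for category in categories:
        redundant |= ancestors(category, parents)
    return [c for c in categories if c not in redundant]

parents = {
    "Cities in France": ["Populated places in France"],
    "Populated places in France": ["Geography of France"],
}
paris = most_specific(["Cities in France", "Populated places in France"], parents)
```

For the Paris example, only "Category:Cities in France" survives, in the order the categories were originally listed.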

Diffusing large categories
Although there is no limit on the size of categories, a large category will often be broken down ("diffused") into smaller, more specific subcategories. For example, Category:Rivers of Europe is broken down by country into the subcategories Rivers of Albania, Rivers of Andorra, etc.

A category may be diffused using several coexisting schemes; for example, Category:Albums is broken down by artist, by date, by genre etc. Metacategories may be created as ways of organizing schemes of subcategories. For example, the subcategories called "Artistname albums" are not placed directly into Category:Albums, but into the metacategory Category:Albums by artist, which itself appears in Category:Albums.

It is possible for a category to be only partially diffused – some members are placed in subcategories, while others remain in the main category.

Information about how a category is diffused may be given on the category page. Categories which are intended to be fully broken down into subcategories can be marked with the catdiffuse template, which indicates that any pages which editors might add to the main category should be moved to the appropriate subcategories when sufficient information is available. (If the proper subcategory for an article does not exist yet, either create the subcategory or leave the article in the parent category for the time being.)

To suggest that a category is so large that it ought to be diffused into subcategories, you can add the verylarge template to the category page.

Non-diffusing subcategories
Not all subcategories serve the "diffusion" function described above; some are simply subsets which have some special characteristic of interest, such as Best Actor Academy Award winners as a subcategory of Film actors, Toll bridges in New York City as a subcategory of Bridges in New York City, and Musical films as a subcategory of Musicals. These are called non-diffusing subcategories (duplicate or redundant categories). They sometimes provide an exception to the general rule that pages are not placed in both a category and its subcategory: there is no need to take pages out of the parent category purely because of their membership of a non-diffusing subcategory. (Of course, if the pages also belong to other subcategories that do cause diffusion, then they will not appear in the parent category directly.)
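The exception can be sketched as a variant of ancestor pruning in which non-diffusing subcategories never cause a parent to be dropped. This is a simplified illustration: the parent mapping, the set of non-diffusing categories, the function name, and the category "Example film actors by country" are all hypothetical.

```python
def prune_parents(categories: list[str],
                  parents: dict[str, list[str]],
                  non_diffusing: set[str]) -> list[str]:
    """Drop parents implied by diffusing subcategories, keeping the rest."""
    redundant: set[str] = set()
    for category in categories:
        if category in non_diffusing:
            continue  # membership here never removes a page from a parent
        stack = list(parents.get(category, []))
        while stack:
            current = stack.pop()
            if current not in redundant:
                redundant.add(current)
                stack.extend(parents.get(current, []))
    return [c for c in categories if c not in redundant]

parents = {
    "Best Actor Academy Award winners": ["Film actors"],
    "Example film actors by country": ["Film actors"],  # hypothetical diffusing subcategory
}
non_diffusing = {"Best Actor Academy Award winners"}

kept = prune_parents(["Film actors", "Best Actor Academy Award winners"],
                     parents, non_diffusing)
diffused = prune_parents(["Film actors", "Best Actor Academy Award winners",
                          "Example film actors by country"],
                         parents, non_diffusing)
```

In the first call the page stays in both categories. In the second, the diffusing subcategory removes "Film actors", matching the parenthetical remark above.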

It is useful to identify non-diffusing subcategories with a note on the category page. The All included and Distinguished subcategory templates can be used.

Subcategories defined by ethnicity and sexuality are often non-diffusing subcategories. See also the gender, race and sexuality categorization guideline.

Eponymous categories
A category which covers the exact same topic as an article (e.g. George W. Bush and Category:George W. Bush, Mekong and Category:Mekong River) is known as an eponymous category.


 * Guidelines for eponymous categories
 * 1) Eponymous categories typically take on a selection of the categories which are present in their corresponding articles. Eponymous categories should only take a category if it continues a logical hierarchy: for example, the article American football belongs to Category:1869 introductions, but that category is not a parent to Category:American football, because the content of the eponymous category is not a subtype of 1869-related material.
 * 2) However, by convention, many categories do contain their articles' eponymous categories as subcategories, even though they are not "true" subcategories.


 * Guidelines for articles with eponymous categories
 * 1) The article itself should be a member of the eponymous category and should be sorted with a space to appear at the start of the listing (as described above under Sort keys).
 * 2) The article should be listed as the main article of the category using the cat main template.
 * 3) Articles with an eponymous category may also be categorized in the broader categories that would be present if there were no eponymous category (e.g. the article France appears in both Category:France and Category:Western Europe, even though the latter is the parent category of the former). However, this is not obligatory; editors should decide by consensus for each category tree which solution makes most sense. There are usually three options:
 * 3a) Keep both the eponymous category and the main article in the parent category. This is used in Category:Western Europe.
 * 3b) Keep just the child article. This is used in Category:British Islands, to prevent a loop.
 * 3c) Keep just the eponymous category. This is used in Category:Religion by country, Category:Archaeology by country, Category:Architecture by country, Category:Military by country and many other categories in the Category:Categories by country tree.

If eponymous categories are categorized separately from their articles, it will be helpful to make links between the category page containing the articles and the category page containing the eponymous categories. The template Related category can be used for this. An example of this set-up is the linked categories Category:American politicians and Category:CorporisPublica categories named after American politicians.

For browsing

 * CorporisPublica:Classification (category tree jumping)
 * Special:CategoryTree
 * CorporisPublica:CatScan
 * Category:CorporisPublica categories
 * Special:Categories (lists all existing categories alphabetically)

For maintenance

 * Special:Uncategorizedpages
 * Special:Uncategorizedimages
 * Special:Uncategorizedcategories (not currently updated)
 * Special:Unusedcategories
 * Special:Wantedcategories
 * Special:Mostlinkedcategories