Frequently Asked Questions
This section answers commonly asked questions.
If you believe a question has been answered incorrectly or with bias, please get in touch.
Punjabi (sometimes spelt Panjabi) is a language spoken predominantly on the Indian sub-continent by approximately 60 million people. There are two variants: eastern Punjabi, spoken in India, and western Punjabi, spoken in Pakistan. In India, Punjabi is an official state language and is written in the Gurmukhi script. In Pakistan, Punjabi is written in a script called Shahmukhi. Even though Punjabi is spoken by approximately 30 million people in Pakistan (representing the single largest linguistic group), it is not an official language there. Due to the promotion of Urdu, Punjabi has considerably less formal use in Pakistan than in India.
There are also large numbers of Punjabi speakers in countries around the world (especially the UK, Canada and the United States). These are predominantly immigrants or descendants of immigrants from the Indian sub-continent.
Gurmukhi is by far the predominant script used for writing Punjabi in eastern (Indian) Punjab. The Gurmukhi script is derived from the Landa alphabet and was standardised by Guru Angad Dev in the 16th century. It forms part of the Brahmi script family.
Gurmukhi should use the locale type 'pa' or 'pa-IN'.
Shahmukhi is based on the Arabic script (read from right to left) and is the predominant script for writing Punjabi in western (Pakistani) Punjab. This web site concentrates on the Gurmukhi side of Punjabi. Users wishing to use Punjabi in Shahmukhi are advised to research Arabic (and Urdu) computing.
Shahmukhi should use the locale type 'pa-PK'.
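As a rough illustration of choosing between the two locale tags above, the sketch below guesses a tag from the script of a piece of text. The function name `guess_punjabi_locale` is hypothetical, and the check only looks at the basic Gurmukhi (U+0A00 to U+0A7F) and Arabic (U+0600 to U+06FF) blocks, so treat it as a heuristic rather than real language detection.

```python
def guess_punjabi_locale(text):
    """Return 'pa-IN', 'pa-PK' or None depending on which script dominates.

    Hypothetical helper: it only counts characters in two Unicode blocks.
    """
    gurmukhi = sum(1 for ch in text if 0x0A00 <= ord(ch) <= 0x0A7F)
    arabic = sum(1 for ch in text if 0x0600 <= ord(ch) <= 0x06FF)
    if gurmukhi == 0 and arabic == 0:
        return None  # neither script found, so no guess is made
    return "pa-IN" if gurmukhi >= arabic else "pa-PK"


# "Punjabi" written in Gurmukhi, then in Shahmukhi
print(guess_punjabi_locale("\u0a2a\u0a70\u0a1c\u0a3e\u0a2c\u0a40"))
print(guess_punjabi_locale("\u067e\u0646\u062c\u0627\u0628\u06cc"))
```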
Unicode is the international standard whose goal is to assign a single code point (an integer) to every character needed by every written human language. For Indic languages, it provides support for 9 different scripts:
Devanagari (Hindi, Marathi, Sanskrit), Bengali (Bengali, Assamese), Gurmukhi (Punjabi), Gujarati, Oriya, Tamil, Telugu, Kannada and Malayalam.
Unicode provides the first well-implemented standard for using Indic scripts on computers across the world.
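To make the idea of a code point concrete, here is a small sketch using Python's standard `unicodedata` module. The choice of the Gurmukhi letter KA as the sample character is ours, purely for illustration.

```python
import unicodedata

ka = "\u0a15"  # a single Gurmukhi character, chosen as a sample

print(hex(ord(ka)))          # the code point as an integer
print(unicodedata.name(ka))  # the official Unicode character name
print(ka.encode("utf-8"))    # the same code point serialised as UTF-8 bytes
```

The code point identifies the character; UTF-8 is just one way of storing that integer as bytes.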
ISCII is the Indian Script Code for Information Interchange. It was developed by the Indian government to represent Indic scripts uniformly across multiple platforms. It was difficult to implement and did not have widespread backing. The Indic portion of Unicode was based on the ISCII standard and currently has much better support.
The use of ISCII on new projects is NOT recommended!
Unicode is the international standard for data interchange. It is slowly but surely replacing all other standards used across the world. With a Unicode compatible computer system you can:
Unicode is a good solution for Punjabi, although it does have its drawbacks:
The Punjabi Computing Resource Centre provides software to enable you to migrate font-based Gurmukhi to Unicode. The program in question - the Gurmukhi Unicode Conversion Application - is free and available to download right now! We also have resources that help you make Unicode-compatible web sites.
The most popular Unicode Gurmukhi font at the moment is Raavi which is supplied with Microsoft Windows. Due to licensing restrictions, we are not allowed to redistribute the font file. In response to this, we have created our own freely distributable font called Saab.
At present, all Indic scripts use the Danda and Double Danda in the Devanagari block at U+0964 and U+0965 respectively. This poses no problem because Unicode lets you freely mix characters from different blocks. One example is the use of Latin punctuation in Gurmukhi text.
The Inscript keyboard layout uses the Devanagari Danda at U+0964.
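The following sketch checks the two Danda code points mentioned above and shows a Gurmukhi word followed by the Devanagari Danda. The sample word and the variable names are illustrative only.

```python
import unicodedata

DANDA = "\u0964"
DOUBLE_DANDA = "\u0965"

for ch in (DANDA, DOUBLE_DANDA):
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")

# A Gurmukhi word ("Punjabi") terminated with the Devanagari Danda
sentence = "\u0a2a\u0a70\u0a1c\u0a3e\u0a2c\u0a40" + DANDA

# Collect the characters that fall outside the Gurmukhi block (U+0A00 to U+0A7F)
outside = [ch for ch in sentence if not 0x0A00 <= ord(ch) <= 0x0A7F]
print(len(outside))
```

Only the Danda itself falls outside the Gurmukhi block, and the string is still perfectly valid Unicode text.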
For historic and transliteration reasons, Unicode encodes independent vowel forms separately. That means they cannot be created using a combination of their components (e.g. Iri + Lavan).
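A short check of this behaviour, assuming the independent vowel in question is U+0A0F (GURMUKHI LETTER EE): normalising the sequence Iri + Lavan does not produce the independent vowel, because Unicode defines no decomposition for it.

```python
import unicodedata

IRI = "\u0a72"        # GURMUKHI IRI (vowel bearer)
LAVAN = "\u0a47"      # GURMUKHI VOWEL SIGN EE
LETTER_EE = "\u0a0f"  # GURMUKHI LETTER EE (independent vowel)

# NFC composes sequences only where Unicode defines a canonical mapping
composed = unicodedata.normalize("NFC", IRI + LAVAN)

print(len(composed))                                # still two code points
print(composed == LETTER_EE)                        # not the independent vowel
print(unicodedata.decomposition(LETTER_EE) == "")   # no decomposition defined
```

In practice this means text should contain the independent vowel's own code point rather than its visual components.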
In some literary texts, Era + Lavan is used in place of Iri + Lavan. After much research, we have concluded that this is indeed a mistake and should be replaced with Iri + Lavan. If you must use this combination, it can be created by using a ZWJ (Zero Width Joiner). That is, Era (U+0A05) + ZWJ (U+200D) + Lavan (U+0A47) to give 'ਅੇ'. This method is not recommended because not all applications and programs handle the use of ZWJ correctly.
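For reference, the ZWJ sequence described above can be built and inspected as follows. This is only a sketch of the encoding; whether it renders as intended depends on the font and application.

```python
import unicodedata

ERA = "\u0a05"    # GURMUKHI LETTER A
ZWJ = "\u200d"    # ZERO WIDTH JOINER
LAVAN = "\u0a47"  # GURMUKHI VOWEL SIGN EE

sequence = ERA + ZWJ + LAVAN

# List each code point so the invisible joiner can be seen
for ch in sequence:
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")
```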
Unicode does not encode Paireen or subjoined characters on their own. To encode a subjoined character, you simply enter the base consonant, the Virama (Halant) sign and finally the full form of the subjoined character you wish to use.
For example, to type 'pra', you would enter ਪ + '੍' + ਰ to give ਪ੍ਰ. If your computer does not have the associated glyph for a particular subjoined form, your computer will display the Virama followed by the full form of the following character (like ਪ੍ਰ). You can force your computer to display the full form by entering a ZWNJ (Zero Width Non-Joiner - U+200C) after the Virama.
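The 'pra' example can be written out as code points like this. The variable names are illustrative; the code points are the ones used in the example above.

```python
import unicodedata

PA = "\u0a2a"      # GURMUKHI LETTER PA
VIRAMA = "\u0a4d"  # GURMUKHI SIGN VIRAMA (Halant)
RA = "\u0a30"      # GURMUKHI LETTER RA
ZWNJ = "\u200c"    # ZERO WIDTH NON-JOINER

pra = PA + VIRAMA + RA                  # requests the subjoined form of RA
pra_explicit = PA + VIRAMA + ZWNJ + RA  # asks for a visible Virama + full RA

for label, text in (("subjoined", pra), ("explicit", pra_explicit)):
    print(label, [f"U+{ord(ch):04X}" for ch in text])
```

Both strings carry the same letters; the ZWNJ only changes how a rendering engine is asked to display them.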
Unicode for Indic languages is based on ISCII. ISCII encoded nine different Indic scripts and provided a mechanism to easily switch between the scripts. This enabled users to view any Indian language text in the script of their choice. This was possible because of the many similarities between Brahmi-based scripts.
As a result, equivalent characters in all Indic scripts sit at the same relative position within their blocks, in an order based on the Devanagari block. Because of this basis on Devanagari, the code point order does not seem correct to Gurmukhi readers.
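The parallel layout can be seen by looking at the same offset in each script's block. The sketch below uses the letter KA, which sits at offset 0x15 in all nine blocks; the dictionary and variable names are our own.

```python
import unicodedata

# Starting code point of each Indic script block
BLOCK_STARTS = {
    "Devanagari": 0x0900,
    "Bengali": 0x0980,
    "Gurmukhi": 0x0A00,
    "Gujarati": 0x0A80,
    "Oriya": 0x0B00,
    "Tamil": 0x0B80,
    "Telugu": 0x0C00,
    "Kannada": 0x0C80,
    "Malayalam": 0x0D00,
}

KA_OFFSET = 0x15  # position of the letter KA within each block

ka_names = {
    script: unicodedata.name(chr(start + KA_OFFSET))
    for script, start in BLOCK_STARTS.items()
}

for script, name in ka_names.items():
    print(f"{script:<11} U+{BLOCK_STARTS[script] + KA_OFFSET:04X} {name}")
```

Every line names the letter KA of its script, which is why the ordering of all nine blocks follows Devanagari rather than each script's own alphabet.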
Unicode provides a special algorithm and guides for sorting and ordering text. The software you are using must implement this in order to correctly sort Gurmukhi text.
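To show why plain code point sorting is not enough, the sketch below compares it with a toy sort key built from the traditional 35-letter Gurmukhi alphabet order. This is not the Unicode collation algorithm: `toy_sort_key` is a hypothetical stand-in that ignores vowel signs, nukta letters and everything else a real implementation must handle.

```python
# Traditional Gurmukhi alphabet order (35 letters), written as code points
GURMUKHI_ORDER = (
    "\u0a73\u0a05\u0a72\u0a38\u0a39"  # Ura, Era, Iri, SA, HA
    "\u0a15\u0a16\u0a17\u0a18\u0a19"  # KA row
    "\u0a1a\u0a1b\u0a1c\u0a1d\u0a1e"  # CA row
    "\u0a1f\u0a20\u0a21\u0a22\u0a23"  # TTA row
    "\u0a24\u0a25\u0a26\u0a27\u0a28"  # TA row
    "\u0a2a\u0a2b\u0a2c\u0a2d\u0a2e"  # PA row
    "\u0a2f\u0a30\u0a32\u0a35\u0a5c"  # YA, RA, LA, VA, RRA
)
RANK = {ch: i for i, ch in enumerate(GURMUKHI_ORDER)}


def toy_sort_key(word):
    # Characters not in the table sort after all listed letters, by code point
    return [(RANK.get(ch, len(RANK)), ord(ch)) for ch in word]


letters = ["\u0a15", "\u0a38", "\u0a39"]  # KA, SA, HA

by_code_point = sorted(letters)                   # raw Unicode order
by_alphabet = sorted(letters, key=toy_sort_key)   # traditional order

print([f"U+{ord(ch):04X}" for ch in by_code_point])
print([f"U+{ord(ch):04X}" for ch in by_alphabet])
```

Code point order puts KA first, whereas the traditional alphabet places SA and HA before KA. Software that sorts Gurmukhi correctly relies on proper collation data rather than a hand-made table like this one.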
At present, Unicode Gurmukhi is geared towards typing modern Punjabi. It has not been implemented with archaic forms of Gurmukhi in mind. Older and Sanskritised forms of Gurmukhi break with some modern Gurmukhi conventions, which makes implementing them particularly troublesome because Unicode rendering engines strictly enforce rules on Indian scripts.
The rules that cause trouble are:
These rules do not conflict with modern Gurmukhi – in fact, they complement modern Gurmukhi – but they cause huge difficulties when a user wishes to enter text in a form that breaks with convention.
Many of the problems can be overcome with sparing use of ZWSP (Zero Width Space), ZWJ (Zero Width Joiner) and ZWNJ (Zero Width Non-Joiner). For example, Onkar on ਓ can be created as follows: (Your browser may have trouble rendering this sequence.)
However, relying on these special characters cannot be a long-term solution, and the underlying problems need to be addressed.
The PCRC has been formulating proposals for many months now and is urgently looking for experts to contribute. If you have in-depth knowledge of non-conventional forms of Gurmukhi and its relation to Sanskrit, Persian and other languages please contact us.
These proposals should go some way towards addressing most of the issues present. However, they may never be able to address stylistic differences such as using a Bindi/Tippi both before and after Bihari.
Copyright © 2004-2005 Sukhjinder Sidhu. All rights reserved.