

**Fig. 1** Performance of LDPC codes generated by semi-random parity check matrices with k = 30000

• $R = 1/3$  
• $R = 1/2$  
• $R = 2/3$

Simulation study: Fig. 1 shows the simulated performance of the proposed encoding method for rates 1/3, 1/2 and 2/3 using t = 4. The decoding algorithm follows that in [2]. The results are essentially the same as those obtained using a fully random H.

*Conclusion:* It has been shown that a semi-random approach to LDPC code design can achieve essentially the same performance as the existing method with considerably reduced complexity.

© IEE 1999 23 November 1998 Electronics Letters Online No: 19990065

Li Ping and W.K. Leung (Department of Electronic Engineering, City University of Hong Kong, Hong Kong)

E-mail: eeliping@cityu.edu.hk

Nam Phamdo (Department of Electrical and Computer Engineering, State University of New York at Stony Brook, Stony Brook, NY 11794-2350, USA)

### References

- 1 GALLAGER, R.G.: 'Low density parity check codes', *IRE Trans. Inf. Theory*, 1962, IT-8, pp. 21-28
- 2 MacKAY, D.J.C., and NEAL, R.M.: 'Near Shannon limit performance of low density parity check codes', *Electron. Lett.*, 1997, 33, (6), pp. 457-458
- 3 PROAKIS, J.G.: 'Digital communications' (McGraw-Hill, 1995)
- 4 PETERSON, W.W., and WELDON, E.J., Jr.: 'Error-correcting codes' (MIT Press, Cambridge, Massachusetts, 1972) 2nd edn.

# Non-binary convolutional codes for turbo coding

C. Berrou and M. Jézéquel

The authors consider the use of non-binary convolutional codes in turbo coding. It is shown that quaternary codes can be advantageous, both from performance and complexity standpoints, but that higher-order codes may not bring further improvement.

Introduction: Turbo codes are error correcting codes with at least two dimensions (i.e. each datum is encoded at least twice). The decoding of turbo codes is based on an iterative procedure using the concept of extrinsic information. Fig. 1 gives an example of a two-dimensional turbo code built from a parallel concatenation of two identical recursive systematic convolutional (RSC) codes with generators 15, 13 (octal notation). The global (non-iterative) decoding of such a code is too complex to be envisaged because of the very large number of states induced by the interleaver. An iterative procedure is therefore used, the two codes being decoded alternately in their own dimensions and the two associated

ELECTRONICS LETTERS 7th January 1999 Vol. 35 No. 1

decoders passing the result of their work to each other, at each iterative step.



Fig. 1 Two-dimensional turbo code with generators 15, 13
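As a minimal sketch of one component code of this kind, the Python fragment below implements a rate-1/2, 8-state RSC encoder for the generators 15, 13. Two points are assumptions rather than statements from the Letter: 15 is taken as the feedback polynomial and 13 as the parity polynomial, and each octal generator is read most-significant-bit first (the leading bit taps the register input). The name `rsc_encode` and its parameters are illustrative only.

```python
def rsc_encode(bits, feedback=0o15, parity=0o13, memory=3):
    """Rate-1/2 recursive systematic convolutional encoder (illustrative sketch).

    Assumed convention: generators are read MSB-first, so bit `memory`
    taps the register input and the lower bits tap cells s1..s_memory.
    Returns (systematic_bits, parity_bits) as two lists.
    """
    def taps(gen):
        # taps(gen)[i] is the coefficient of D**i for i = 0..memory
        return [(gen >> (memory - i)) & 1 for i in range(memory + 1)]

    fb, ff = taps(feedback), taps(parity)
    state = [0] * memory          # state[0] is the newest register cell
    parity_bits = []
    for d in bits:
        # Recursive part: register input = data XOR feedback taps on the state
        a = d
        for i in range(memory):
            a ^= fb[i + 1] & state[i]
        # Parity output: feedforward taps on the register input and the state
        p = ff[0] & a
        for i in range(memory):
            p ^= ff[i + 1] & state[i]
        parity_bits.append(p)
        state = [a] + state[:-1]  # shift the register by one cell
    return list(bits), parity_bits


systematic_bits, parity_bits = rsc_encode([1] + [0] * 14)
```

Under these assumptions a single 1 followed by zeros gives a parity stream that never dies out and repeats with period 7, which reflects the recursive structure of the code. A two-dimensional turbo encoder would apply the same function once to the data and once to the interleaved data.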

Binary codes versus quaternary codes: Fig. 2a represents a block of size k encoded by the code of Fig. 1. This block is seen as a two-dimensional $\sqrt{k} \times \sqrt{k}$ block and, for simplicity, we consider that the interleaver is a regular one: the sequence is encoded first by $C_1$, following the horizontal or linewise dimension, and secondly by $C_2$, following the vertical or columnwise dimension. The dashes on both dimensions symbolise the path error packets at the output of the two decoders, at a particular step of the iterative process. These packets do not contain only erroneous decisions but they indicate where a wrong path has been chosen either by the decoder of $C_1$ or by the decoder of $C_2$. This corresponds to a certain path error density per dimension, which is the same in both dimensions if the component codes are identical.
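The regular interleaver assumed in this argument can be sketched as a row/column permutation: the block is written line by line into the square and read column by column. The sketch assumes that k is a perfect square; the function name `regular_interleave` is hypothetical.

```python
import math


def regular_interleave(block):
    """Write `block` row by row into a sqrt(k) x sqrt(k) square, read it column by column."""
    k = len(block)
    side = math.isqrt(k)
    if side * side != k:
        raise ValueError("block length must be a perfect square")
    # Element (r, c) of the square sits at index r * side + c in the input
    return [block[r * side + c] for c in range(side) for r in range(side)]


# C1 would see the data in the original (linewise) order,
# C2 would see them in the interleaved (columnwise) order.
columnwise = regular_interleave(list(range(9)))
```

Because the permutation is a transpose of a square array, applying it twice restores the original order, so the same routine can serve as the de-interleaver in this simplified setting.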



**Fig. 2** *Path error packets in turbo decoding*

*a* Binary codes

*b* Quaternary codes



Fig. 3 8-state quaternary recursive systematic convolutional (RSC) code with generators 15, 13

The performance of turbo decoding is strongly dependent on the path error density per dimension. Obviously, the more numerous and the longer the horizontal and vertical dashes in the square box are, the harder the convergence to the correct codeword is. Now for each component code, replace the binary code of Fig. 1 by the quaternary code of Fig. 3. The data are thus encoded and interleaved in couples. The size of the block is k/2 couples and the square box now has the dimensions $\sqrt{k/2} \times \sqrt{k/2}$ (Fig. 2b). When a decoder selects a path in the decoding trellis, the same amount of information is used for binary and quaternary codes, but a quaternary code needs only half the number of transitions, so the path error packets are half the length. Recalling that each dimension of the square block has been divided by $\sqrt{2}$ and not by 2, the path error density per dimension has been divided by $\sqrt{2}$. This leads to slightly better performance, which was already reported in [1], even in comparison with 16-state turbo codes, which are the reference for practical turbo codes (see for instance [2]).
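One way to make this scaling explicit is to assume, purely for illustration, that the path error density per dimension is proportional to the packet length divided by the side of the square block. The symbols $\ell$ (packet length in transitions for the binary code) and $\rho$ (density) are introduced here and do not appear in the original argument:

$$
\rho_{\text{bin}} \propto \frac{\ell}{\sqrt{k}}, \qquad
\rho_{\text{quat}} \propto \frac{\ell/2}{\sqrt{k/2}} = \frac{1}{\sqrt{2}}\,\frac{\ell}{\sqrt{k}} = \frac{\rho_{\text{bin}}}{\sqrt{2}}
$$

The packet length is halved while the side of the box shrinks only by $\sqrt{2}$, which leaves a net reduction of $\sqrt{2}$ in the density.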

Table 1: Comparison of computation needs in a 16-state/2-path decoder and in an 8-state/4-path decoder

|                                                 | 16-state/2-path | 8-state/4-path |
|-------------------------------------------------|-----------------|----------------|
| Transition metrics to compute                   | 4               | 16             |
| Node metrics memory                             | 16 registers    | 8 registers    |
| Adders for the computation of node metrics      | 32              | 32             |
| Comparators for the selection of node metrics   | 16              | 24             |
| Adders for the computation of the soft-output   | 32              | 32             |
| Comparators for the computation of the soft-output | 30           | 28             |

Complexity standpoint: Table 1 briefly compares the combinatory decoding complexity of a 16-state, 2-path per node decoder and that of an 8-state, 4-path per node decoder, corresponding to the code of Fig. 3 for instance. The decoding algorithm considered is the max-log-APP algorithm, also called dual Viterbi decoder [3], whose complexity is roughly twice the complexity of the Viterbi algorithm. Only one processor for forward and backward recursions is considered, but the comparison is still valid. Finally, coding rates for component codes are assumed to be equal to or greater than 1/2. On a first reading, the complexity appears to be of the same order for decoding a binary soft-output in a 16-state trellis and a quaternary soft-output in an 8-state trellis. In fact, taking the size of quantised numbers into account, especially in transition metrics, we can consider that the quaternary decoder is around 30% more complex. If we now refer to the complexity per decoded bit, the quaternary decoder is simpler by 35%, since it delivers two bits per trellis step. So the conclusion is dependent both on the hardware platform (DSP or ASIC) and on the data rate. However, generally speaking, the 8-state/4-path decoder seems to be more attractive to design. Note also that non-binary decoders have been considered in the literature for a long time for resolving high data rate decoding implementations [4].
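The two percentages quoted above are consistent with each other, as the short check below suggests. It takes the 30% figure from the text and assumes two decoded bits per trellis step for the quaternary decoder against one for the binary decoder, as implied by encoding the data in couples. The function name is hypothetical.

```python
def per_bit_complexity_ratio(relative_step_complexity, bits_per_step):
    """Complexity per decoded bit, relative to a reference decoder that
    has unit complexity per trellis step and decodes one bit per step."""
    return relative_step_complexity / bits_per_step


# 8-state/4-path decoder: about 1.30x the work per trellis step (from the
# text), spread over 2 bits per step (assumption stated in the lead-in).
quaternary_ratio = per_bit_complexity_ratio(1.30, 2)
saving = 1.0 - quaternary_ratio
print(f"per-bit complexity ratio: {quaternary_ratio:.2f}, saving: {saving:.0%}")
```

The ratio of 0.65 corresponds to the 35% saving per decoded bit stated in the text.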

Other *m*-ary codes: Can we extend the principle of quaternary codes to decrease the path error density per dimension to *m*-ary codes with m > 4? With quaternary 8-state codes, the shortest wrong path in a decoder is of length 2. Because it is the minimum possible value, increasing *m* is of no interest for this purpose, unless we use codes with more than 8 states. But, unfortunately, the performance of turbo decoding for low signal-to-noise ratios is degraded when adopting large constraint lengths (> 5). This results from the construction of extrinsic information, which becomes less and less reliable as the number of states increases. So the relevance of *m*-ary codes (m > 4), relative to the path error densities, would be diminished by some deterioration in the iterative process.

*Conclusion:* 8-state quaternary codes appear to be good candidates for turbo coding, from the points of view of performance, complexity and throughput. Encoding data by couples of bits divides the block size by a factor of two. Despite this apparent handicap, the path error density per dimension is lowered, which explains the improvement in performance.

© IEE 1999 23 November 1998 Electronics Letters Online No: 19990059

C. Berrou and M. Jézéquel (ENST de Bretagne, BP 832, 29285 Brest Cedex, France)

E-mail: claude.berrou@enst-bretagne.fr

### References

- 1 BERROU, C.: 'Some clinical aspects of turbo codes'. Proc. International Symposium on Turbo Codes & Related Topics, Brest, September 1997, pp. 26–31
- 2 'Draft CCSDS recommendation for telemetry channel coding (updated to include turbo codes)'. Consultative Committee for Space Data Systems, Rev. 4, May 1998
- 3 ROBERTSON, P., HOEHER, P., and VILLEBRUN, E.: 'Optimal and suboptimal maximum a posteriori algorithms suitable for turbo decoding', *Eur. Trans. Telecommun.*, 1997, 8, pp. 119–125
- 4 FETTWEIS, G., and MEYR, H.: 'Parallel Viterbi algorithm implementation: breaking the ACS-bottleneck', *IEEE Trans. Commun.*, 1989, 37, (8), pp. 785–790

# 46GHz bandwidth monolithic InP/InGaAs pin/SHBT photoreceiver

D. Huber, M. Bitter, T. Morf, C. Bergamaschi, H. Melchior and H. Jäckel

An InGaAs *pin*-photodetector and a lumped SHBT transimpedance preamplifier have been monolithically integrated and characterised. The preamplifier achieves a transimpedance gain of 44.6 dBΩ ($170\Omega$) and the optical/electrical -3 dB bandwidth of the entire receiver is 46GHz, which is the highest bandwidth for any HBT based photoreceiver reported to date.

Introduction: The monolithic integration of photodetectors and preamplifiers in InP-based materials has been the focus of research in industrial and academic laboratories owing to the ease of fabrication. Because the InGaAs base-collector junction of an HBT can be used to form the photodiode, no additional epitaxial and processing steps are necessary for this cointegration. Furthermore, the InGaAs pin-diode serves as a detector for light at 1.55µm wavelength, the interconnection parasitics between diode and preamplifier can be reduced to a minimum, and high-performance transistors are available. Previously, we presented a similar receiver with outstanding results for sensitivity, gain and bandwidth [1]. Our new generation of improved SHBT and a reduction in transimpedance gain to $170\Omega$ led to the high bandwidth of 46GHz being presented in this Letter. This development step is a consequence of the fact that, with the introduction of Er-doped fibre amplifiers (EDFAs) as optical preamplifiers, the requirements for sensitivity and gain are relaxed and operating speed has become one of the most important issues.
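The transimpedance gain is quoted in this Letter both in ohms and in dBΩ. Assuming the usual convention of 20 log10 of the value referred to 1 Ω, the two figures can be cross-checked with the short sketch below; the function names are illustrative.

```python
import math


def ohms_to_dbohm(r_ohms):
    """Express a transimpedance in dBΩ, i.e. 20*log10 of the value relative to 1 Ω."""
    return 20.0 * math.log10(r_ohms)


def dbohm_to_ohms(gain_dbohm):
    """Inverse conversion from dBΩ back to ohms."""
    return 10.0 ** (gain_dbohm / 20.0)


# Transimpedance quoted in the Letter: 170 Ω
print(round(ohms_to_dbohm(170.0), 1))
```

With this convention 170 Ω rounds to the 44.6 dBΩ stated in the abstract.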

Device structure and fabrication: The self-aligned InP/InGaAs single-heterojunction bipolar transistor (SHBT) process has been described in detail in [2]. The process provides transistors having an emitter area of $1.0 \times 5\,\mu\text{m}^2$, a $9\,\mu\text{m}$ diameter photodiode with an antireflection coating, thin-film $50\,\Omega/\Box$ Cr resistors and supply blocking capacitors formed by base-collector layer depletion capacitances. The MOVPE-grown device layer structure is the same as described in [1] with the exceptions of reduced base- and collector-layer thicknesses of 50 and 400nm, respectively. The latter results from a tradeoff between the collector transit time of the HBT on one side and the responsivity and parasitic capacitance of the diode on the other side. To extend the RC-limited photodiode bandwidth to well over 40GHz, a preamplifier with low input impedance is necessary. Therefore, we chose a common-base input stage followed by a transimpedance amplifier stage and an emitter-follower output stage (Fig. 1) as originally proposed in [3].

*Experimental results:* All measurements were performed on-wafer. Optical characterisation was carried out using a lensed single-mode fibre for top illumination. The DC responsivity of the photodiode including coating losses is R = 0.32 A/W, the parasitic capacitance $C_{dep} = 70\ \text{fF}$ and the -3dB bandwidth into a $50\,\Omega$ load was measured to be $f_{-3dB} = 30\ \text{GHz}$. The HBTs exhibit a DC current gain of $\beta = 25$, a transit frequency of $f_T = 130\ \text{GHz}$ and a maximum oscillation frequency of $f_{max} = 270\ \text{GHz}$ at a collector current density of $J_c = 1.3\ \text{mA}/\mu\text{m}^2$ and a collector-emitter voltage range of $V_{ce} = 1.5$–2V. The DC transimpedance of the preamplifier was measured