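As I mentioned before, H.264 is fairly complex, so I plan to analyze it piece by piece. These are my study notes and my own understanding; much of it may be excerpted from related material, so thanks to the original authors! If I have misunderstood something, please point it out.
Note: this is still a draft and will be updated from time to time!
After going through introductory material on H.264 coding (Overview of the H.264/AVC Video Coding Standard; H.264-MPEG-4 Part 10 White Paper; ISO_IEC_14496-10_2012; 新一代视频压缩编码标准H.264; H.264码流结构解析; there is far too much online to list it all here, including the articles written by predecessors such as 天之骄子/firstime and 李世平/Peter Lee), I still did not fully understand all of it. My approach is to understand what I can and to reread what I cannot (switching between sources and going over them repeatedly helps with the parts that made no sense at first). Either way, I ended up with a rough understanding of H.264.
My habit when studying a codec like this is to start from the encoded data file, because however complex the codec is, the final file (byte stream) must conform to some specification.
If you plan to read on, you should already know the basics of H.264: what VCL and NAL are, what Exp-Golomb/Huffman coding is, that the bitstream is made up of a sequence of NALUs, and so on.
First we want a tool that can analyze an H.264 bitstream so that we can inspect it visually. Unfortunately such tools are expensive, with most licenses costing upwards of a thousand US dollars; fortunately, 21-day trials or feature-limited demo versions can be found.
I searched online for such tools; CodecVisa, StreamEye, and VM Analyzer are all worth trying.
Another question: where do you get a bitstream file? I found a download on someone's website, but the archive needs a password. I wrote to him hoping to get it, he has not replied yet, and so I kept looking. It does not have to be that much trouble, though: we can generate the bitstream ourselves. The JM code ships with foreman_part_qcif.yuv, which you can encode into an H.264 bitstream with its lencod.exe. That means first learning what JM is and how to build and use it, but all of that is simple. YUV video sequences can also be downloaded from the web; there are plenty.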
Download YUV video sequences: http://trace.eas.asu.edu/yuv/
H.264 test model / documents: http://iphome.hhi.de/suehring/tml/
http://wftp3.itu.int/av-arch/jvt-site/
李世平 (Peter Lee): http://blog.csdn.net/sunshine1314
天之骄子 (firstime): http://bbs.chinavideo.org/viewthread.php?tid=988
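The tool I use here to inspect the bitstream is StreamEye, and the bitstream file was encoded from foreman_part_qcif.yuv.
I will not cover how to use these tools; it comes down to seeing which frames and frame types make up the video and what the various properties (Header/MacroBlock/Picture) are.
The following was copied from Headers Info. At a glance it contains the SPS, the PPS, and a slice header. But what is it good for, and what do these values mean?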
[00]seq_parameter_set_rbsp() {
  profile_idc = 66 (Baseline)
  constraint_set0_flag = 0 (false)
  constraint_set1_flag = 0 (false)
  constraint_set2_flag = 0 (false)
  constraint_set3_flag = 0 (false)
  constraint_set4_flag = 0 (false)
  constraint_set5_flag = 0 (false)
  reserved_zero_2bits = 0
  level_idc = 30
  seq_parameter_set_id = 0
  if (profile_idc == 100 || profile_idc == 110 || profile_idc == 122 || profile_idc == 144) {
    chroma_format_idc = na
    if (chroma_format_idc == 3)
      separate_colour_plane_flag = na
    bit_depth_luma_minus8 = na
    bit_depth_chroma_minus8 = na
    qpprime_y_zero_transform_bypass_flag = na
    seq_scaling_matrix_present_flag = na
    if (seq_scaling_matrix_present_flag)
      for (i = 0; i < 8; i++) {
        seq_scaling_list_present_flag[0] = na
        if (seq_scaling_list_present_flag[0])
          scaling_list_4x4[00..15] = na
        seq_scaling_list_present_flag[1] = na
        if (seq_scaling_list_present_flag[1])
          scaling_list_4x4[00..15] = na
        seq_scaling_list_present_flag[2] = na
        if (seq_scaling_list_present_flag[2])
          scaling_list_4x4[00..15] = na
        seq_scaling_list_present_flag[3] = na
        if (seq_scaling_list_present_flag[3])
          scaling_list_4x4[00..15] = na
        seq_scaling_list_present_flag[4] = na
        if (seq_scaling_list_present_flag[4])
          scaling_list_4x4[00..15] = na
        seq_scaling_list_present_flag[5] = na
        if (seq_scaling_list_present_flag[5])
          scaling_list_4x4[00..15] = na
        seq_scaling_list_present_flag[6] = na
        if (seq_scaling_list_present_flag[6])
          scaling_list_8x8[00..63] = na
        seq_scaling_list_present_flag[7] = na
        if (seq_scaling_list_present_flag[7])
          scaling_list_8x8[00..63] = na
      }
  }
  }
  log2_max_frame_num_minus4 = 0 (4)
  pic_order_cnt_type = 0
  if (pic_order_cnt_type == 0)
    log2_max_pic_order_cnt_lsb_minus4 = 0 (4)
  else if (pic_order_cnt_type == 1) {
    delta_pic_order_always_zero_flag = na
    offset_for_non_ref_pic = na
    offset_for_top_to_bottom_field = na
    num_ref_frames_in_pic_order_cnt_cycle = na
    for(i = 0; i < num_ref_frames_in_pic_order_cnt_cycle; i++)
  }
  max_num_ref_frames = 10
  gaps_in_frame_num_value_allowed_flag = 0
  pic_width_in_mbs_minus1 = 10 (176)
  pic_height_in_map_units_minus1 = 8 (144)
  frame_mbs_only_flag = 1
  if (!frame_mbs_only_flag)
    mb_adaptive_frame_field_flag = na
  direct_8x8_inference_flag = 0 (false)
  frame_cropping_flag = 0 (false)
  if (frame_cropping_flag) {
    frame_crop_left_offset = na
    frame_crop_right_offset = na
    frame_crop_top_offset = na
    frame_crop_bottom_offset = na
  }
  vui_parameters_present_flag = 0 (false)
  if (vui_parameters_present_flag)
  }
  vui_parameters()
}

[00]pic_parameter_set_rbsp() {
  pic_parameter_set_id = 0
  seq_parameter_set_id = 0
  entropy_coding_mode_flag = 0 (CAVLC)
  pic_order_present_flag = 0 (false)
  num_slice_groups_minus1 = 0 (1)
  if (num_slice_groups_minus1 > 0) {
    slice_group_map_type = na
    if (slice_group_map_type == 0)
      for (iGroup = 0; iGroup <= num_slice_groups_minus1; iGroup++) {
      }
    else if (slice_group_map_type == 2)
      for (iGroup = 0; iGroup < num_slice_groups_minus1; iGroup++) {
      }
    else if ((slice_group_map_type == 3) || (slice_group_map_type == 4) || (slice_group_map_type == 5)) {
      slice_group_change_direction_flag = na
      slice_group_change_rate_minus1 = na
    }
    else if (slice_group_map_type == 6) {
      pic_size_in_map_units_minus1 = na
      for (i = 0; i <= pic_size_in_map_units_minus1; i++)
    }
  }
  num_ref_idx_l0_active_minus1 = 9 (10)
  num_ref_idx_l1_active_minus1 = 9 (10)
  weighted_pred_flag = 0 (false)
  weighted_bipred_idc = 0
  pic_init_qp_minus26 = 0 (26)
  pic_init_qs_minus26 = 0 (26)
  chroma_qp_index_offset = 0
  deblocking_filter_control_present_flag = 0 (false)
  constrained_intra_pred_flag = 0 (false)
  redundant_pic_cnt_present_flag = 0 (false)
  }
  if (more_rbsp_data()) {
    transform_8x8_mode_flag = na
    pic_scaling_matrix_present_flag = na
    if (pic_scaling_matrix_present_flag) {
      for (i = 0; i < 6 + 2 * transform_8x8_mode_flag; i++) {
        pic_scaling_list_present_flag[0] = na
        if (pic_scaling_list_present_flag[0])
          scaling_list_4x4[0][00..15] = na
        pic_scaling_list_present_flag[1] = na
        if (pic_scaling_list_present_flag[1])
          scaling_list_4x4[1][00..15] = na
        pic_scaling_list_present_flag[2] = na
        if (pic_scaling_list_present_flag[2])
          scaling_list_4x4[2][00..15] = na
        pic_scaling_list_present_flag[3] = na
        if (pic_scaling_list_present_flag[3])
          scaling_list_4x4[3][00..15] = na
        pic_scaling_list_present_flag[4] = na
        if (pic_scaling_list_present_flag[4])
          scaling_list_4x4[4][00..15] = na
        pic_scaling_list_present_flag[5] = na
        if (pic_scaling_list_present_flag[5])
          scaling_list_4x4[5][00..15] = na
      }
    }
    second_chroma_qp_index_offset = na
  }
}

[00]slice_header() {
  nal_unit_header_svc_extension() {
    idr_flag = na
    priority_id = na
    no_inter_layer_pred_flag = na
    dependency_id = na
    quality_id = na
    temporal_id = na
    use_ref_base_pic_flag = na
    discardable_flag = na
    output_flag = na
  }
  first_mb_in_slice = 0
  slice_type = 7 (I slice)
  pic_parameter_set_id = 0
  frame_num = 0
  if (!frame_mbs_only_flag) {
    field_pic_flag = na
    if (field_pic_flag)
      bottom_field_flag = na
  }
  if (nal_unit_type == 5)
    idr_pic_id = 0
  if (pic_order_cnt_type == 0) {
    pic_order_cnt_lsb = 0
    if (pic_order_present_flag && !field_pic_flag)
      delta_pic_order_cnt_bottom = na
  }
  if (pic_order_cnt_type == 1 && !delta_pic_order_always_zero_flag) {
    delta_pic_order_cnt[0] = na
    if (pic_order_present_flag && !field_pic_flag)
      delta_pic_order_cnt[1] = na
  }
  if (redundant_pic_cnt_present_flag)
    redundant_pic_cnt = na
  if (slice_type == B)
    direct_spatial_mv_pred_flag = na
  if (slice_type == P || slice_type == SP || slice_type == B) {
    num_ref_idx_active_override_flag = na
    if (num_ref_idx_active_override_flag) {
      num_ref_idx_l0_active_minus1 = na
      if (slice_type == B )
        num_ref_idx_l1_active_minus1 = na
    }
  }
  if (nal_unit_type == 20)
    ref_pic_list_mvc_modification()
  else
    ref_pic_list_modification()
  if ((weighted_pred_flag && (slice_type == P || slice_type == SP)) || (weighted_bipred_idc == 1 && slice_type == B))
    pred_weight_table()
  if (nal_ref_idc != 0)
    dec_ref_pic_marking()
  if (entropy_coding_mode_flag && slice_type != I && slice_type != SI)
    cabac_init_idc = na
  slice_qp_delta = 2
  if (slice_type == SP || slice_type == SI) {
    if (slice_type == SP)
      sp_for_switch_flag = na
    slice_qs_delta = na
  }
  if (deblocking_filter_control_present_flag) {
    disable_deblocking_filter_idc = na
    if (disable_deblocking_filter_idc != 1) {
      slice_alpha_c0_offset_div2 = na
      slice_beta_offset_div2 = na
    }
  }
  if (num_slice_groups_minus1 > 0 && slice_group_map_type >= 3 && slice_group_map_type <= 5)
    slice_group_change_cycle = na
}
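Now let's analyze. We know the bitstream is made up of NAL units, and each NALU consists of a NALU header plus RBSP data. The RBSP may be an SPS, a PPS, a slice, or SEI. In our stream no SEI appears; the SPS is the first NALU, the PPS is the second, and the rest are slices (to be more rigorous, IDR and so on could be distinguished further). foreman_part_qcif.yuv has only 3 frames, so does the encoded stream contain 5 NALUs? We can make a bold guess and then verify it carefully.
What is the NALU header? See Spec 7.3, Syntax in tabular form. If you have read 天之骄子's articles, you know that 7.3 and 7.4 of the Spec correspond to each other, so both must be read; 7.3 is in effect pseudo-code for how the bitstream is laid out and parsed. You need to be familiar with both the C (Category) and Descriptor columns and with what f(1), u(2), b(8), ue(v) and so on mean; 7.2 and 9.1 explain these in detail. Briefly, ue(v), se(v) and the like are Exp-Golomb codes, while f(1) and u(2) are plain fixed-length bit fields, e.g. a 2-bit unsigned integer, a 32-bit integer, and so on.
So 7.3 and 7.4 must be understood thoroughly; only then can you analyze H.264 on the basis of the bitstream.
Now let's begin. Below is a run of hexadecimal data from an H.264 bitstream file, so you will need a hex editor.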
00 00 00 01 67 42 00 1E F1 61 62 62 00 00 00 01 68 C8 A1 43 88 00
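We know that 00 00 00 01 marks the start of a NALU (the start code of the Annex B byte stream format), so if you open the complete bitstream file you should find 00 00 00 01 five times. These are the 5 NALUs we mentioned earlier: the SPS, the PPS, and 3 slices.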
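The start-code scan can be sketched in a few lines of Python. This is only a minimal sketch under my own assumptions: `split_nalus` is a name made up for illustration, the scan accepts both the 4-byte form and the shorter 3-byte form 00 00 01, and it does not handle emulation prevention bytes. It is run on the 22 bytes shown above, whose final 00 is presumably the first byte of the next start code.

```python
def split_nalus(stream: bytes) -> list:
    """Split an Annex B byte stream into NAL units (start codes removed)."""
    nalus = []
    start = None  # index of the first byte of the NALU being collected
    i = 0
    while i + 3 <= len(stream):
        if stream[i:i + 3] == b"\x00\x00\x01":
            if start is not None:
                # Zero bytes right before a start code belong to the
                # 4-byte form (or padding), not to the NALU itself.
                nalus.append(stream[start:i].rstrip(b"\x00"))
            start = i + 3
            i += 3
        else:
            i += 1
    if start is not None:
        nalus.append(stream[start:].rstrip(b"\x00"))
    return nalus


sample = bytes.fromhex(
    "00 00 00 01 67 42 00 1E F1 61 62 62 00 00 00 01 68 C8 A1 43 88 00"
)
nalus = split_nalus(sample)
for nalu in nalus:
    print(len(nalu), nalu.hex())
```

On the complete file the same function should return 5 entries if the guess above is right. Stripping trailing zero bytes is safe here because a NAL unit is not allowed to end in a 00 byte.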
First, some reference data: these are the NALU types defined by the Spec (Table 7-1 – NAL unit type codes, syntax element categories, and NAL unit type classes). For now we only need to look at SPS, PPS, IDR, and Slice.
#define NALU_TYPE_SLICE 1
#define NALU_TYPE_DPA 2
#define NALU_TYPE_DPB 3
#define NALU_TYPE_DPC 4
#define NALU_TYPE_IDR 5
#define NALU_TYPE_SEI 6
#define NALU_TYPE_SPS 7
#define NALU_TYPE_PPS 8
#define NALU_TYPE_AUD 9
#define NALU_TYPE_EOSEQ 10
#define NALU_TYPE_EOSTREAM 11
#define NALU_TYPE_FILL 12
A) Let's start with the first NALU (8 bytes: a 1-byte NALU header followed by 7 bytes of RBSP)
67 42 00 1E F1 61 62 62
Converted to binary:
01100111 01000010 00000000 00011110 11110001 01100001 01100010 01100010
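If you do not want to convert hex to binary by hand, a tiny helper does it. This is just an illustrative sketch; `to_bit_string` is a name I made up.

```python
def to_bit_string(data: bytes) -> str:
    """Render each byte as 8 binary digits, most significant bit first."""
    return " ".join(f"{byte:08b}" for byte in data)


sps_nalu = bytes.fromhex("67 42 00 1E F1 61 62 62")
print(to_bit_string(sps_nalu))
```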
First, the NALU header:
forbidden_zero_bit
nal_ref_idc
nal_unit_type
These three fields occupy 8 bits in total (the Spec gives them 1, 2, and 5 bits respectively), so parsing against that gives
forbidden_zero_bit = 0 // 0
nal_ref_idc = 3 // 11
nal_unit_type = 7 // 00111
That is right; compare with
#define NALU_TYPE_SPS 7
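The 1 + 2 + 5 bit split of the header byte can be written with shifts and masks. A minimal sketch; `parse_nal_header` and the small name table are my own illustrative helpers, with the names taken from the #define list above.

```python
NAL_TYPE_NAMES = {1: "SLICE", 5: "IDR", 6: "SEI", 7: "SPS", 8: "PPS"}


def parse_nal_header(first_byte: int) -> dict:
    """Split the first NALU byte into its three fixed-width fields."""
    return {
        "forbidden_zero_bit": (first_byte >> 7) & 0x01,  # 1 bit
        "nal_ref_idc": (first_byte >> 5) & 0x03,         # 2 bits
        "nal_unit_type": first_byte & 0x1F,              # 5 bits
    }


for byte in (0x67, 0x68, 0x65):
    header = parse_nal_header(byte)
    print(hex(byte), header, NAL_TYPE_NAMES.get(header["nal_unit_type"]))
```

The three bytes tried here are the first bytes of the three NALUs discussed in this post.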
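Some syntax elements further on in the Spec sit inside if conditions and are present only when the condition holds. Here nal_unit_type is 7, which does not satisfy the condition, so we skip straight ahead into the RBSP. This one is an SPS, so following the Spec we have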
profile_idc
constraint_set0_flag
constraint_set1_flag
constraint_set2_flag
constraint_set3_flag
constraint_set4_flag
constraint_set5_flag
reserved_zero_2bits
level_idc
seq_parameter_set_id
Of these fields, everything before seq_parameter_set_id is easy to parse, so we simply write out the values:
profile_idc = 66 // 01000010
constraint_set0_flag = 0 // 0
constraint_set1_flag = 0 // 0
constraint_set2_flag = 0 // 0
constraint_set3_flag = 0 // 0
constraint_set4_flag = 0 // 0
constraint_set5_flag = 0 // 0
reserved_zero_2bits = 0 // 00
level_idc = 30 // 00011110
seq_parameter_set_id is ue(v), an Exp-Golomb code, so the number of bits a value takes is not fixed. The data we have left is 11110001 01100001 01100010 01100010.
For the formula, see the Spec (9.1 Parsing process for Exp-Golomb codes):
leadingZeroBits = -1
for (b = 0; !b; leadingZeroBits++)
    b = read_bits(1)
codeNum = 2^(leadingZeroBits) - 1 + read_bits(leadingZeroBits)
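Notation like 2^k denotes exponentiation.
The process reads 1 bit, which here is 1, so the loop exits; leadingZeroBits++ still runs once, though, so leadingZeroBits ends up as 0 and the later read_bits reads no data.
codeNum = 2^0 - 1 + 0 = 0
In other words, a field coded as the single bit 1 has the actual value 0.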
seq_parameter_set_id = 0 // Exp-Golomb decode of 1
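The pseudo-code from 9.1 translates almost line for line into Python. A minimal sketch: `BitReader` and `read_ue` are names I made up, and the reader keeps the bits as a string of '0'/'1' characters purely for readability, not speed.

```python
class BitReader:
    """Reads bits most-significant-bit first (illustrative helper)."""

    def __init__(self, data: bytes):
        self.bits = "".join(f"{byte:08b}" for byte in data)
        self.pos = 0  # index of the next unread bit

    def read_bits(self, n: int) -> int:
        chunk = self.bits[self.pos:self.pos + n]
        if len(chunk) < n:
            raise EOFError("ran out of bits")
        self.pos += n
        return int(chunk, 2) if n else 0


def read_ue(reader: BitReader) -> int:
    """ue(v): count leading zero bits, then read that many suffix bits."""
    leading_zero_bits = -1
    b = 0
    while not b:
        b = reader.read_bits(1)
        leading_zero_bits += 1
    return 2 ** leading_zero_bits - 1 + reader.read_bits(leading_zero_bits)


# The bits left after level_idc: 11110001 01100001 01100010 01100010
reader = BitReader(bytes.fromhex("F1 61 62 62"))
values = [read_ue(reader) for _ in range(5)]
print(values, reader.pos)
```

The first four calls each consume a single 1 bit; the fifth consumes the 7-bit codeword 0001011.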
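Likewise, the if block that follows is not entered, so we go straight to
log2_max_frame_num_minus4 = 0 // Exp-Golomb decode of 1
pic_order_cnt_type = 0 // Exp-Golomb decode of 1
log2_max_pic_order_cnt_lsb_minus4 = 0 // Exp-Golomb decode of 1
max_num_ref_frames = 10 // the bit stream now starts at 0001 (the four leading 1s were consumed by the four fields above), so leadingZeroBits is 3 and the result is 2^3 - 1 + read_bits(3) = 8 - 1 + 3 (suffix 011) = 10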
gaps_in_frame_num_value_allowed_flag = 0 // 0
pic_width_in_mbs_minus1 = 10 // Exp-Golomb decode of 0001 011
pic_height_in_map_units_minus1 = 8 // Exp-Golomb decode of 0001 001
If Exp-Golomb is still unclear, see the decoding section of Exponential-Golomb coding. The remaining fields are
frame_mbs_only_flag = 1 // 1
direct_8x8_inference_flag = 0 // 0
frame_cropping_flag = 0 // 0
vui_parameters_present_flag = 0 // 0
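Two bits, 10, are still unused. All the data so far (apart from the NALU header) is seq_parameter_set_data, and according to the Spec it is followed by trailing bits:
seq_parameter_set_rbsp( ) {
    seq_parameter_set_data( ) // data
    rbsp_trailing_bits( ) // pad to a byte boundary
}
For the padding rule see 7.3.2.11 RBSP trailing bits syntax: a stop bit equal to 1 is followed by zero bits up to the next byte boundary, which is where the two bits 10 come from.
Looking back, this is the SPS data, i.e. the first NALU, and it matches the SPS copied from Headers Info exactly, so we have now tied the Spec to a real bitstream. One more thing worth noting: in the data copied from Headers Info, "na" means the value is not present, i.e. the enclosing if condition was not satisfied.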
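To double-check the hand parse, here is a rough Python sketch that reads the same SPS fields in order. It is not a complete SPS parser: `BitReader` and `parse_sps` are my own illustrative names, only the branches this stream takes are handled (no High-profile fields, pic_order_cnt_type 0 only, no VUI), and the profile list simply mirrors the one in the Headers Info dump (newer editions of the Spec list more profiles).

```python
class BitReader:
    """MSB-first bit reader over a bytes object (illustrative helper)."""

    def __init__(self, data: bytes):
        self.bits = "".join(f"{byte:08b}" for byte in data)
        self.pos = 0

    def u(self, n: int) -> int:
        """u(n): unsigned integer using n bits."""
        chunk = self.bits[self.pos:self.pos + n]
        if len(chunk) < n:
            raise EOFError("ran out of bits")
        self.pos += n
        return int(chunk, 2) if n else 0

    def ue(self) -> int:
        """ue(v): unsigned Exp-Golomb code."""
        zeros = 0
        while self.u(1) == 0:
            zeros += 1
        return (1 << zeros) - 1 + self.u(zeros)


def parse_sps(rbsp: bytes) -> dict:
    r = BitReader(rbsp)
    sps = {"profile_idc": r.u(8)}
    for i in range(6):
        sps[f"constraint_set{i}_flag"] = r.u(1)
    sps["reserved_zero_2bits"] = r.u(2)
    sps["level_idc"] = r.u(8)
    sps["seq_parameter_set_id"] = r.ue()
    if sps["profile_idc"] in (100, 110, 122, 144):
        raise NotImplementedError("High-profile fields are outside this sketch")
    sps["log2_max_frame_num_minus4"] = r.ue()
    sps["pic_order_cnt_type"] = r.ue()
    if sps["pic_order_cnt_type"] == 0:
        sps["log2_max_pic_order_cnt_lsb_minus4"] = r.ue()
    elif sps["pic_order_cnt_type"] == 1:
        raise NotImplementedError("pic_order_cnt_type 1 is outside this sketch")
    sps["max_num_ref_frames"] = r.ue()
    sps["gaps_in_frame_num_value_allowed_flag"] = r.u(1)
    sps["pic_width_in_mbs_minus1"] = r.ue()
    sps["pic_height_in_map_units_minus1"] = r.ue()
    sps["frame_mbs_only_flag"] = r.u(1)
    if not sps["frame_mbs_only_flag"]:
        sps["mb_adaptive_frame_field_flag"] = r.u(1)
    sps["direct_8x8_inference_flag"] = r.u(1)
    sps["frame_cropping_flag"] = r.u(1)
    if sps["frame_cropping_flag"]:
        for side in ("left", "right", "top", "bottom"):
            sps[f"frame_crop_{side}_offset"] = r.ue()
    sps["vui_parameters_present_flag"] = r.u(1)
    sps["trailing_bits"] = r.bits[r.pos:]  # whatever is left unread
    return sps


nalu = bytes.fromhex("67 42 00 1E F1 61 62 62")
sps = parse_sps(nalu[1:])  # skip the 1-byte NALU header
# Macroblocks are 16x16; this simple formula assumes frame_mbs_only_flag = 1
# and no cropping, which is the case for this stream.
width = (sps["pic_width_in_mbs_minus1"] + 1) * 16
height = (sps["pic_height_in_map_units_minus1"] + 1) * 16
print(width, height, sps["trailing_bits"])
```

The derived 176 x 144 is the QCIF size that Headers Info shows in parentheses next to the two fields.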
B) Now let's look at the PPS in the same way (5 bytes)
68 C8 A1 43 88
Converted to binary:
01101000 11001000 10100001 01000011 10001000
Again the NALU header comes first; it parses as
forbidden_zero_bit = 0 // 0
nal_ref_idc = 3 // 11
nal_unit_type = 8 // 01000
which corresponds to
#define NALU_TYPE_PPS 8
so we know the RBSP here is a PPS
pic_parameter_set_id = 0 // Exp-Golomb decode of 1
seq_parameter_set_id = 0 // Exp-Golomb decode of 1
entropy_coding_mode_flag = 0 // 0
bottom_field_pic_order_in_frame_present_flag = 0 // 0
num_slice_groups_minus1 = 0 // Exp-Golomb decode of 1
num_ref_idx_l0_default_active_minus1 = 9 // Exp-Golomb decode of 0001 010
num_ref_idx_l1_default_active_minus1 = 9 // Exp-Golomb decode of 0001 010
weighted_pred_flag = 0 // 0
weighted_bipred_idc = 0 // 00
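pic_init_qp_minus26 = 0 // se(v): Exp-Golomb decode of 1 gives codeNum 0, which maps to 0
pic_init_qs_minus26 = 0 // se(v): Exp-Golomb decode of 1 gives codeNum 0, which maps to 0
chroma_qp_index_offset = 0 // se(v): Exp-Golomb decode of 1 gives codeNum 0, which maps to 0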
deblocking_filter_control_present_flag = 0 // 0
constrained_intra_pred_flag = 0 // 0
redundant_pic_cnt_present_flag = 0 // 0
Four bits, 1000, remain; these are the byte-alignment trailing bits.
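The PPS walk-through can be checked the same way. Again this is only a sketch under my own assumptions: `BitReader` and `parse_pps` are illustrative names, and the slice-group branch (num_slice_groups_minus1 > 0) and the optional fields after more_rbsp_data() are not handled.

```python
class BitReader:
    """MSB-first bit reader over a bytes object (illustrative helper)."""

    def __init__(self, data: bytes):
        self.bits = "".join(f"{byte:08b}" for byte in data)
        self.pos = 0

    def u(self, n: int) -> int:
        chunk = self.bits[self.pos:self.pos + n]
        if len(chunk) < n:
            raise EOFError("ran out of bits")
        self.pos += n
        return int(chunk, 2) if n else 0

    def ue(self) -> int:
        zeros = 0
        while self.u(1) == 0:
            zeros += 1
        return (1 << zeros) - 1 + self.u(zeros)

    def se(self) -> int:
        """se(v): odd codeNum k maps to +(k+1)/2, even k maps to -k/2."""
        k = self.ue()
        return (k + 1) // 2 if k % 2 else -(k // 2)


def parse_pps(rbsp: bytes) -> dict:
    r = BitReader(rbsp)
    pps = {}
    pps["pic_parameter_set_id"] = r.ue()
    pps["seq_parameter_set_id"] = r.ue()
    pps["entropy_coding_mode_flag"] = r.u(1)
    pps["bottom_field_pic_order_in_frame_present_flag"] = r.u(1)
    pps["num_slice_groups_minus1"] = r.ue()
    if pps["num_slice_groups_minus1"] > 0:
        raise NotImplementedError("slice groups are outside this sketch")
    pps["num_ref_idx_l0_default_active_minus1"] = r.ue()
    pps["num_ref_idx_l1_default_active_minus1"] = r.ue()
    pps["weighted_pred_flag"] = r.u(1)
    pps["weighted_bipred_idc"] = r.u(2)
    pps["pic_init_qp_minus26"] = r.se()
    pps["pic_init_qs_minus26"] = r.se()
    pps["chroma_qp_index_offset"] = r.se()
    pps["deblocking_filter_control_present_flag"] = r.u(1)
    pps["constrained_intra_pred_flag"] = r.u(1)
    pps["redundant_pic_cnt_present_flag"] = r.u(1)
    pps["trailing_bits"] = r.bits[r.pos:]  # whatever is left unread
    return pps


nalu = bytes.fromhex("68 C8 A1 43 88")
pps = parse_pps(nalu[1:])  # skip the 1-byte NALU header
print(pps)
```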
C) This is where the slice data begins
Look at part of the data first (the first 4 bytes)
65 88 84 02
Converted to binary:
01100101 10001000 10000100 00000010
Again the NALU header comes first; it parses as
forbidden_zero_bit = 0 // 0
nal_ref_idc = 3 // 11
nal_unit_type = 5 // 00101
which corresponds to
#define NALU_TYPE_IDR 5
so we know this is an IDR picture (see related material on what an IDR is and how an IDR differs from an I slice)
first_mb_in_slice = 0 // Exp-Golomb decode of 1
slice_type = 7 // Exp-Golomb decode of 0001000, i.e. an I slice; for slice_type see Table 7-6 – Name association to slice_type
pic_parameter_set_id = 0 // Exp-Golomb decode of 1
frame_num = 0 // u(v): read log2_max_frame_num_minus4 + 4 bits // 0000
frame_num deserves a special note. Its Descriptor is u(v), and looking up u(v) we find
u(n): unsigned integer using n bits. When n is “v” in the syntax table, the number of bits varies in a manner dependent on the value of other syntax elements.
In other words, the number of bits this field occupies depends on other fields, so search the Spec for frame_num and you get
frame_num is used as an identifier for pictures and shall be represented by log2_max_frame_num_minus4 + 4 bits in the bitstream.
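So it becomes fairly clear: the number of bits frame_num occupies depends on log2_max_frame_num_minus4. From the SPS we know log2_max_frame_num_minus4 = 0, so frame_num takes 4 bits here, namely 0000, which parses to 0. Note also that frame_num is subject to many constraints, for example it must be 0 in an IDR picture; see 7.4.3 Slice header semantics. It is worth pointing out that this is a complete, excellent Spec that covers essentially everything we need; we just have to look for it and analyze it (the process can be tedious and sometimes baffling, but trust that the answer is in there).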
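The u(v) case comes down to "compute the width from the SPS, then read that many bits". A minimal sketch; `frame_num_bits` is a made-up helper name, and the starting bit offset is simply the sum of the field widths decoded above.

```python
def frame_num_bits(log2_max_frame_num_minus4: int) -> int:
    """Number of bits used for frame_num in the slice header."""
    return log2_max_frame_num_minus4 + 4


slice_bytes = bytes.fromhex("65 88 84 02")
bits = "".join(f"{byte:08b}" for byte in slice_bytes)

# Bits already consumed: NALU header (8), first_mb_in_slice (1),
# slice_type (7), pic_parameter_set_id (1).
start = 8 + 1 + 7 + 1
n = frame_num_bits(0)  # log2_max_frame_num_minus4 = 0 from the SPS
frame_num = int(bits[start:start + n], 2)
max_frame_num = 1 << n  # frame_num wraps around at this value
print(n, frame_num, max_frame_num)
```

pic_order_cnt_lsb works the same way, with log2_max_pic_order_cnt_lsb_minus4 + 4 as the width.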
idr_pic_id = 0 // Exp-Golomb decode of 1
pic_order_cnt_lsb = 0 // u(v): read log2_max_pic_order_cnt_lsb_minus4 + 4 bits // 0000
Remaining: 000010
Now we enter the ref_pic_list_modification( ) function, but none of the if conditions inside it is satisfied
Then we enter dec_ref_pic_marking( )
no_output_of_prior_pics_flag = 0 // 0
long_term_reference_flag = 0 // 0
Only the four bits 0010 are left now, so we bring in 3 more bytes (63 61 7C): 01100011 01100001 01111100
Next we decode slice_qp_delta. Note that its Descriptor is se(v), so we first do the Exp-Golomb decode and then apply the mapping to get the value.
0010 01100011 01100001 01111100 // there are two leading zeros here, so the Exp-Golomb codeword is 00100 (5 bits, suffix 00); codeNum = 2^2 - 1 + 0 = 3, and substituting k = 3 into (-1)^(k + 1) * Ceil(k / 2) gives 2. For details see 9.1.1 Mapping process for signed Exp-Golomb codes.
slice_qp_delta = 2 // 00100 // se(v)
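The signed mapping from 9.1.1 is short enough to try out directly. A small sketch; `map_se` and `decode_se` are illustrative names, and `decode_se` works on a string of '0'/'1' characters for simplicity.

```python
import math


def map_se(code_num: int) -> int:
    """Map an unsigned codeNum k to (-1)^(k + 1) * Ceil(k / 2)."""
    return (-1) ** (code_num + 1) * math.ceil(code_num / 2)


def decode_se(bits: str) -> int:
    """Decode one se(v) value from the start of a bit string."""
    zeros = bits.index("1")                        # leadingZeroBits
    suffix = bits[zeros + 1:zeros + 1 + zeros]     # same number of suffix bits
    code_num = (1 << zeros) - 1 + (int(suffix, 2) if suffix else 0)
    return map_se(code_num)


print([map_se(k) for k in range(7)])
# The leftover 0010 followed by the bits of 63 61 7C
print(decode_se("0010" + "011000110110000101111100"))
```

So codeNum values 0, 1, 2, 3, 4, ... alternate between positive and negative: 0, 1, -1, 2, -2, and so on.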
That completes the parsing of the slice header.
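Putting the slice header steps together gives a sketch like the following. It is deliberately narrow and rests on assumptions taken from this particular stream: an IDR NALU carrying an I slice, frame_mbs_only_flag = 1, pic_order_cnt_type = 0, CAVLC, and the PPS flags that would add extra fields all equal to 0. `BitReader` and `parse_idr_i_slice_header` are names I made up for illustration.

```python
class BitReader:
    """MSB-first bit reader over a bytes object (illustrative helper)."""

    def __init__(self, data: bytes):
        self.bits = "".join(f"{byte:08b}" for byte in data)
        self.pos = 0

    def u(self, n: int) -> int:
        chunk = self.bits[self.pos:self.pos + n]
        if len(chunk) < n:
            raise EOFError("ran out of bits")
        self.pos += n
        return int(chunk, 2) if n else 0

    def ue(self) -> int:
        zeros = 0
        while self.u(1) == 0:
            zeros += 1
        return (1 << zeros) - 1 + self.u(zeros)

    def se(self) -> int:
        k = self.ue()
        return (k + 1) // 2 if k % 2 else -(k // 2)


def parse_idr_i_slice_header(nalu: bytes, sps: dict) -> dict:
    r = BitReader(nalu)
    r.u(1)                      # forbidden_zero_bit
    nal_ref_idc = r.u(2)
    nal_unit_type = r.u(5)
    if nal_unit_type != 5:
        raise NotImplementedError("only IDR NALUs are handled in this sketch")
    hdr = {"first_mb_in_slice": r.ue(), "slice_type": r.ue()}
    if hdr["slice_type"] % 5 != 2:
        raise NotImplementedError("only I slices are handled in this sketch")
    hdr["pic_parameter_set_id"] = r.ue()
    hdr["frame_num"] = r.u(sps["log2_max_frame_num_minus4"] + 4)
    hdr["idr_pic_id"] = r.ue()  # present because nal_unit_type == 5
    hdr["pic_order_cnt_lsb"] = r.u(sps["log2_max_pic_order_cnt_lsb_minus4"] + 4)
    # ref_pic_list_modification(): nothing is read for an I slice.
    if nal_ref_idc != 0:        # dec_ref_pic_marking(), IDR branch
        hdr["no_output_of_prior_pics_flag"] = r.u(1)
        hdr["long_term_reference_flag"] = r.u(1)
    hdr["slice_qp_delta"] = r.se()
    hdr["bits_consumed"] = r.pos
    return hdr


# Values decoded from the SPS and PPS earlier in this post.
sps = {"log2_max_frame_num_minus4": 0, "log2_max_pic_order_cnt_lsb_minus4": 0}
pic_init_qp_minus26 = 0

nalu_start = bytes.fromhex("65 88 84 02 63 61 7C")
hdr = parse_idr_i_slice_header(nalu_start, sps)
slice_qp = 26 + pic_init_qp_minus26 + hdr["slice_qp_delta"]
print(hdr, slice_qp)
```

The header ends 33 bits into the NALU, i.e. one bit into the byte 63; everything after that is slice data. The last line applies the Spec's formula for the initial luma QP of the slice, 26 + pic_init_qp_minus26 + slice_qp_delta.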
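That's it for now. Note that we have only worked through parts of the first three NALUs (for the first slice, the IDR, we covered only the header; the data part is left for later). The remaining two slices will be analyzed when the need arises.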