Trouble with hashed tables
Monday, October 02, 2006
Today I learned the hard way that hashed internal tables do not work exactly as I expected. What we (or at least I) learn from, say, the BC400 class or the ABAP reference is that there are standard, sorted and hashed internal tables. Standard tables are kind of all-purpose, sorted ones are better for LOOP AT ... WHERE, and hashed tables are good if you need to do READ TABLE with a unique key. For some reason I assumed that if I read data containing duplicates into a hashed table, it would be nicely populated and the duplicates would simply be skipped. Well, assumption is the mother of all screw-ups, as they say. Very true.
In my defense, I actually went through my BC400 materials and the ABAP reference and could not find any clues on this, so here is some info on how it actually works. Let’s say you want a list of deliveries and their material numbers, with each delivery/material combination appearing only once. Here is an example of a bad idea:
TYPES: BEGIN OF deliveries,
         vbeln TYPE vbeln,
         matnr TYPE matnr,
       END OF deliveries.

DATA: i_deliveries TYPE HASHED TABLE OF deliveries
                   WITH UNIQUE KEY vbeln matnr.

SELECT vbeln matnr
  INTO TABLE i_deliveries
  FROM lips.
This program will end with a short dump if any VBELN has more than one item with the same MATNR, because the second such row violates the unique key of the hashed table. However, this disaster can easily be avoided by changing SELECT to SELECT DISTINCT. Another option (depending on your task) is to SELECT into a standard table, then SORT it, run DELETE ADJACENT DUPLICATES and copy the content to a hashed table. This seems a bit redundant (most likely SELECT DISTINCT is going to work faster) but might be necessary sometimes, you never know.
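To make the two workarounds concrete, here is a minimal sketch of both. The report name and the ty_/lt_ names are made up for illustration, and the fields are typed by reference to LIPS so the snippet does not depend on particular data elements. Treat it as one possible way to write this, not the only one.

```abap
REPORT z_hashed_table_demo.  " hypothetical report name

TYPES: BEGIN OF ty_delivery,
         vbeln TYPE lips-vbeln,
         matnr TYPE lips-matnr,
       END OF ty_delivery.

DATA: lt_standard TYPE STANDARD TABLE OF ty_delivery,
      lt_hashed   TYPE HASHED TABLE OF ty_delivery
                  WITH UNIQUE KEY vbeln matnr.

* Option 1: let the database remove the duplicates.
* The selected fields are exactly the key fields, so every
* returned row has a distinct key.
SELECT DISTINCT vbeln matnr
  INTO TABLE lt_hashed
  FROM lips.

* Option 2: read everything into a standard table and
* remove the duplicates in ABAP before filling the hashed table.
SELECT vbeln matnr
  INTO TABLE lt_standard
  FROM lips.

" DELETE ADJACENT DUPLICATES only compares neighbouring rows,
" so the table has to be sorted by the same fields first.
SORT lt_standard BY vbeln matnr.
DELETE ADJACENT DUPLICATES FROM lt_standard COMPARING vbeln matnr.

" Safe now: no two rows share the same vbeln/matnr.
" This overwrites the content filled by option 1.
lt_hashed = lt_standard.
```

In a real program you would pick one of the two options. Both are shown in one report only to keep the comparison in a single place.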
Also be careful when doing, for example,
i_deliveries_hashed[] = i_deliveries[].
(Here i_deliveries is a standard table and i_deliveries_hashed is a hashed table.) If i_deliveries contains records that are duplicates with respect to the hashed table key, this will also end in a short dump. Good old SORT and DELETE ADJACENT DUPLICATES will help here as well.
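If sorting the source table is not an option, a different technique (not the SORT / DELETE ADJACENT DUPLICATES one described above) is to insert the rows one at a time. A single-row INSERT ... INTO TABLE does not dump on a duplicate key. It sets sy-subrc to 4 and leaves the table unchanged. The sketch below assumes made-up names and sample values:

```abap
REPORT z_hashed_insert_demo.  " hypothetical report name

TYPES: BEGIN OF ty_delivery,
         vbeln TYPE lips-vbeln,
         matnr TYPE lips-matnr,
       END OF ty_delivery.

DATA: lt_standard TYPE STANDARD TABLE OF ty_delivery,
      lt_hashed   TYPE HASHED TABLE OF ty_delivery
                  WITH UNIQUE KEY vbeln matnr,
      ls_delivery TYPE ty_delivery,
      lv_skipped  TYPE i.

* Sample data (made-up values): the first row is appended twice.
ls_delivery-vbeln = '0080000001'.
ls_delivery-matnr = 'MAT-A'.
APPEND ls_delivery TO lt_standard.
APPEND ls_delivery TO lt_standard.
ls_delivery-matnr = 'MAT-B'.
APPEND ls_delivery TO lt_standard.

* Copy row by row instead of assigning the whole table.
LOOP AT lt_standard INTO ls_delivery.
  INSERT ls_delivery INTO TABLE lt_hashed.
  IF sy-subrc <> 0.
    " Key already exists in lt_hashed: the row is not inserted.
    lv_skipped = lv_skipped + 1.
  ENDIF.
ENDLOOP.

WRITE: / 'Skipped duplicates:', lv_skipped.
```

With the sample rows above, lt_hashed ends up with two entries and one row is skipped. The row-by-row loop is likely slower than a block assignment on large tables, so it is a trade-off rather than a drop-in replacement.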
posted by Your Friendly ABAPer @ 20:42
2 Comments:
- At 24/2/11 07:17, said...
When using SELECT DISTINCT, the SAP table buffer is bypassed. So even when the sort/delete adjacent thing seems to be redundant, it might be the better solution.
- At 28/3/11 23:31, said...
It really depends on the particular task and therefore in every situation it's up to the developer to make the right choice. YMMV, so to say.