Avoid repetition of the rows

I have this table In postgres database. First 2 rows (highlighted in gray) have the same data in these columns (user_id, criteria_01, criteria_02, criteria_03) but they have different value in date column.
the 3rd, 4th, 5th rows also has the same info in columns(user_id, criteria_01, criteria_02, criteria_03) and they have different date value.
since there are 2 PKs in the table (id and date), table accepts new entry with the same id and different date.
Data imported from an external server to fill this table and date column is filled base on the date of the new import (it also include time that I ignored here). If there is an import for today then date column get the value of today’s date. Data can be imported every 3 minutes
I need to redesign the structure of the tables. Is there any way to prevent the repetition of rows which have the same info all columns except date column? ( if I delete date column then repetition will be gone, but date is needed for visualization tool that they use).
if there is no way that’s fine I just want to know, if I can change it or not? This is what the company asked me to do.

Can I add another column like end date to prevent this repetition? Or is there any way to prevent it?

1 Like

Hi,

Apologies for not answering sooner but the holidays are upon us.

I’m not entirely certain I understood your question but here’s a little example that will hopefully help you.

The answer is to use a UNIQUE key/index:

-- GIVEN THE FOLLOWING
create table t1 (
    user_id bigint,
    c1 int,
    c2 int,
    c3 varchar(100),
    date timestamp with time zone,
    primary key (user_id,date)
);
-- SUGGESTED SOLUTION:
create table t2 (
    user_id bigserial primary key,
    c1 int not null,
    c2 int not null,
    c3 varchar(100) not null,
    date_start date not null,
    date_end date default now(),
    unique (c1, c2, c3, date_start)
);

-- INSERT EXAMPLE

-- this works
insert into t2 (user_id, c1, c2, c3, date_end, date_start)
    values (default, 1, 1, 'hello', '26 nov 2022'::date, now());

-- this gets rejected
insert into t2 (user_id, c1, c2, c3, date)
    values (default, 1, 1, 'hello', '28 nov 2022'::date, now());

Hope this helps :slight_smile:

1 Like

Thanks for your answer.
the goal is to reduce redundancy. it is time series data.
currently it is possible to have same columns Criteria(01, 02,03) with different dates. I want to reduce redundancy.

1 Like

Glad I was able to help. Don’t hesitate to reach out if you have any other questions.

1 Like