Avoid repetition of the rows

farmit · November 26, 2022, 10:19pm

I have this table In postgres database. First 2 rows (highlighted in gray) have the same data in these columns (user_id, criteria_01, criteria_02, criteria_03) but they have different value in date column.
the 3rd, 4th, 5th rows also has the same info in columns(user_id, criteria_01, criteria_02, criteria_03) and they have different date value.
since there are 2 PKs in the table (id and date), table accepts new entry with the same id and different date.
Data imported from an external server to fill this table and date column is filled base on the date of the new import (it also include time that I ignored here). If there is an import for today then date column get the value of today’s date. Data can be imported every 3 minutes
I need to redesign the structure of the tables. Is there any way to prevent the repetition of rows which have the same info all columns except date column? ( if I delete date column then repetition will be gone, but date is needed for visualization tool that they use).
if there is no way that’s fine I just want to know, if I can change it or not? This is what the company asked me to do.

Can I add another column like end date to prevent this repetition? Or is there any way to prevent it?

Robert_Bernier · November 28, 2022, 3:53pm

Hi,

Apologies for not answering sooner but the holidays are upon us.

I’m not entirely certain I understood your question but here’s a little example that will hopefully help you.

The answer is to use a UNIQUE key/index:

-- GIVEN THE FOLLOWING
create table t1 (
    user_id bigint,
    c1 int,
    c2 int,
    c3 varchar(100),
    date timestamp with time zone,
    primary key (user_id,date)
);

-- SUGGESTED SOLUTION:
create table t2 (
    user_id bigserial primary key,
    c1 int not null,
    c2 int not null,
    c3 varchar(100) not null,
    date_start date not null,
    date_end date default now(),
    unique (c1, c2, c3, date_start)
);

-- INSERT EXAMPLE

-- this works
insert into t2 (user_id, c1, c2, c3, date_end, date_start)
    values (default, 1, 1, 'hello', '26 nov 2022'::date, now());

-- this gets rejected
insert into t2 (user_id, c1, c2, c3, date)
    values (default, 1, 1, 'hello', '28 nov 2022'::date, now());

Hope this helps

farmit · November 29, 2022, 12:02am

Thanks for your answer.
the goal is to reduce redundancy. it is time series data.
currently it is possible to have same columns Criteria(01, 02,03) with different dates. I want to reduce redundancy.

Robert_Bernier · November 29, 2022, 2:13pm

Glad I was able to help. Don’t hesitate to reach out if you have any other questions.

Topic		Replies	Views
Consult(query) repeating data PostgreSQL closed-no-reply	0	444	October 31, 2023
Mysql data partioning and trimming Other MySQL® Questions	0	373	June 24, 2012
Query help Other MySQL® Questions	7	541	January 6, 2009
Maintaining Central DB Data Sequence Other MySQL® Questions	2	491	February 28, 2010
Why are the queryid different? pg_stat_monitor	6	88	January 8, 2025

Avoid repetition of the rows

Related topics