Quantcast
Channel: schema design - Stack Overflow
Viewing all articles
Browse latest Browse all 12

Answer by Jonathan Leffler for schema design

$
0
0

@mson has asked the question "What do you do when a question is not satisfactorily answered on SO?", which is a direct reference to the existing answers to this question.

I contributed the following answer to that discussion, primarily critiquing the way the question was asked.


Quote (verbatim):

I looked at the original question yesterday, and decided not to contribute an answer.

One problem was the use of the term 'model' as in 'GM models' - which cited 'Chevrolet, Saturn, Cadillac' as 'models'. To my understanding, these are not models at all; they are 'brands', though there might also be an industry-insider term for them that I'm not familiar with, such as 'division'. A model would be a 'Saturn Vue' or 'Chevrolet Impala' or 'Cadillac Escalade'. Indeed, there could well be models at a more detailed level than that - different variants of the Saturn Vue, for example.

So, I didn't think that the starting point was well framed. I didn't critique it; it wasn't quite compelling enough, and there were answers coming in, so I let other people try it.

The next problem is that it is not clear what your DBMS is going to be storing as data. If you're storing a million records per 'model' ('brand'), then what sorts of data are you dealing with? Lurking in the background is a different scenario - the real scenario - and your question has used an analogy that failed to be sufficiently realistic. That means that the 'it depends' parts of the answer are far more voluminous than the 'this is how to do it' ones. There is just woefully too little background information on the data to be modelled to allow us to guess what might be best.

Ultimately, it will depend on what uses people have for the data. If the information is going to go flying off in all different directions (different data structures in different brands; different data structures at the car model levels; different structures for the different dealerships - the Chevrolet dealers are handled differently from the Saturn dealers and the Cadillac dealers), then the integrated structure provides limited benefit. If everything is the same all the way down, then the integrated structure provides a lot of benefit.

Are there legal reasons (or benefits) to segregating the data? To what extent are the different brands separate legal entities where shared records could be a liability? Are there privacy issues, such that it will be easier to control access to the data if the data for the separate brands is stored separately?

Without a lot more detail about the scenario being modelled, no-one can give a reliable general answer - at least, not more than the top-voted one already gives (or doesn't give).

  • Data modelling is not easy.
  • Data modelling without sufficient information is impossible to do reliably.

I have copied the material here since it is more directly relevant. I do think that to answer this question satisfactorily, a lot more context should be given. And it is possible that there needs to be enough extra context to make SO the wrong place to ask it. SO has its limitations, and one of those is that it cannot deal with questions which require long explanations.

From the SO FAQs page:

What kind of questions can I ask here?

Programming questions, of course! As long as your question is:

  • detailed and specific
  • written clearly and simply
  • of interest to at least one other programmer somewhere

...

What kind of questions should I not ask here?

Avoid asking questions that are subjective, argumentative, or require extended discussion. This is a place for questions that can be answered!

This question is, IMO, close to the 'require extended discussion' limit.


Viewing all articles
Browse latest Browse all 12

Trending Articles