PDA

View Full Version : Database/programming question


ThunderingHERD
04-28-2007, 05:22 PM
[Irrelevant backstory]

I'm involved in research attempting to find links between certain endophenotypes consistent with schizophrenia and various SNPs of certain genes. Part of the process involves having the participants fill out a large # of questionnaires, some of which were originally intended to be orally administered by the experimenter. Early on I suggested that it would be much more efficient, and create much less variance on the experimenter adminstered stuff (particularly with regards to "sensitive" questions), to have all of the questionnaires filled out on a computer.

A few days later the prof emails out a link to a survey site that he's gotten demo account on, asking us to enter the questionnaires on the site. It quickly becomes apparent, however, that there are too many branches and contingencies in certain questionnaires to make this feasible. I searched for other survey sites and software and couldn't find anything that did exactly what we needed to do (unfortunately, I wasn't aware of MS Infopath at the time, which seems like it would work well). So I tell the professor that I had enough (rusty) programming knowledge to come up with something that would do exactly what we needed.

So right now I have the program working with 3 the three rather simple questionnaires. If you're interested, want to point out how much I suck at programming, or criticize my awful database design, you can download it at http://www.researchingen.com/files/question.zip (the menu is accessed with the "!" button and the password is "open"; "?"s provide help).

Anyway, I've run into some problems that have been causing me to put off working out the two remaining questionnaires--the ones that really needed to be computerized in the first place. The first problem is that they're just very poorly designed (from a pychometric standpoint), and it's discouraging to even look at them.

[Actual question]

The second problem is that, though I'm relatively competent at VB.NET/2005, I have no previous experience with database design or programming. I figured out enough to get through the first 3 questionnaires, as they were rather simple, but I have some issues with these final two.

I want to store as much of the raw data as possible, but also want to have it in a format where it's easy to compute a final score (we haven't actually decided on how to score it yet, which makes the whole thing more frustrating). For example, one question asks the participant what sports they played as a child, and presents a rather long list of checkboxes with various sports. Internally, the sports are split into three different categories. What would be the best way of storing this in a database? Creating a seperate field for each sport? Given that this sort of question comes up several times, this would create a massive number of fields. Should I create a seperate table for each question like this, and create a field for each response in that table? Or is there some better way to do this?

Any suggestions are appreciated.

MikeVic
04-28-2007, 06:20 PM
I think a separate boolean column for each option is ok? Otherwise, you can have an nvarchar or ntext column that stores all the options together in a way like this:

"0; 1; 0; 1;"

Then when you're processing the data, you'd have to make sure you split the data line and to determine a user's options... but this way makes it kind of hard to add more sports options.

Antmeister
04-29-2007, 01:08 AM
Hopefully I have understood what you were trying to convey. Now from my understanding, you are looking to tailor a questionnaire based on the sports that are chosen and minimize the amount of data that the database is storing. If this is the case, I would do the following things.


I would add three additional tables:
SPORTS
QUESTIONS
SPORTS_QUESTIONS
PARTICIPANTS_SPORTS

The SPORTS table would contain:
Sport_ID (Primary key)
Sport

The QUESTIONS table would contain:
Question_ID (Primary key)
Question

The SPORTS_QUESTIONS table would contain:
Sports_ID (Primary key)
Question_ID (Primary key)
Answers (I am just assuming this is some boolean value for the correct answer. I placed it here because I don't know if the answer will differ for each sport. If this is not the case, move the field to the question table)
Value (Since you are not quite sure how they will be scoring this, I placed this here some that you could give the question some sort of integer to represent a value)

The PARTICIPANTS_SPORTS table would contain:
Participant_ID (Primary Key)
Sport_ID (Primary Key)
Final_Score


Based on the sports that the participant chooses, you can narrow down what question are asked from the SPORTS_QUESTIONS table and the results will be stored in the PARTICIPANTS_SPORTS table. The SPORTS and QUESTIONS table are more for reference and can be updated in the future.

ThunderingHERD
04-29-2007, 08:03 PM
Thanks a lot, both of you. Exactly the info I needed!